Hackipedia hand edited PDF to text conversion.
This file is UTF-8, please make sure your text viewer supports it.
---------------------------------------------------------------

Introduction
------------

Word for Windows uses the same basic file format for its document, glossary, and autosave files. This document describes the Word for Windows document format with additional comments explaining the differences for glossary and autosave files.

The most important sections of a Word document are the text and formatting sections. The text section is straight ANSI text, although some of the low order characters have been reserved for special use, such as forced line feeds and page breaks. No formatting "reveal" codes exist in the text section. In a Word for Windows document, formatting is stored in special sections and related to the text by sequential tables.

Extracting textual information from a Word for Windows document is very simple: read the ANSI text section and ignore the formatting section.

A list of differences between the file format for Word for Windows versions 1.x and 2.0 has been added to this document as Appendix A.

Table of Contents

Introduction
Table of Contents
Definitions
    page (or sector):
    document:
    file:
    CP (Character Position):
    FC (File Character position):
    PLCF (PLex of CPs (or FCs) stored in File):
    piece table:
    sprm (Single PRoperty Modifier):
    grpprl (group of prls):
    prm (PRoperty Modifier):
    full-saved (or non-complex) file:
    fast-saved (or complex) file:
    FIB (File Information Block):
    paragraph
    run of text
    section
    style
    CHP (CHaracter Properties)
    CHPX (Character Property EXception)
    PAP (PAragraph Properties)
    PAPX (PAragraph Property EXception)
    table row:
    TAP (TAble Properties):
    STSH (STyle SHeet)
    FKP (Formatted disK Page):
    bin table
    SEP (SEction Properties)
    SEPX (SEction Property EXceptions)
    DOP (DOcument Properties)
    sub-document
    field
Naming Conventions
Non-Complex File Format
Complex File Format
File Information Block (FIB)
Text
Character and Paragraph Formatting Properties
Bin Tables
Style Sheets
SPRM Definitions
Complex File Format
    Algorithm to determine the bounds of a paragraph containing a certain character in a complex file
    Algorithm to determine paragraph properties for a paragraph in a complex file
    Algorithm to determine table properties for a table row in a complex file
    Algorithm to determine the character properties of a character in a complex file
    Algorithm to determine the section properties of a section in a complex file
    Algorithm to determine the PIC of a picture in a complex file.
Footnotes
Headers and Footers
Page Table
Glossary Files
sttbfAssoc (Table of Associated Strings)
Structure Definitions
    BRC: Border Code
    BRC10: Border Code for Word for Windows 1.0
    CHP/CHPX: Character Properties
    CHP10/CHPX: Character Properties for Word for Windows 1.0
    DOP: Document Properties
    DTTM: Date and Time (internal date format)
    FIB: File Information Block
    FKP: Formatted Disk Page
    FLD: Field Descriptor
    OBJHEADER: Embedded Object Properties
    PAP: Paragraph Properties
    PAPX: Paragraph Property Exceptions
    PCD: Piece Descriptor
    PGD: Page Descriptor
    PHE: Paragraph Height
    PIC: Picture Descriptor
    PLCF: Plex of CPs stored in File
    PRM: Property Modifier
        PRM: Property Modifier (variant 1)
        PRM: Property Modifier (variant 2)
    SED: Section Descriptor
    SEP: Section Properties
    SEPX: Section Property Exceptions
    TAP: Table Properties
    TBD: Tab Descriptor
    TC: Table Cell Descriptors
Appendix A - Changes from version 1.x to 2.0
    Changes to Structures
        BRC
        CHP
        DOP
        OBJHEADER
        PAP
        PGD
        PIC
        SED
        SEP
        TAP
        TC
    Other changes
        Autosave Source
        Embedded Objects
        Hand Annotation
        New Sprm definitions
        sttbfAssoc
        sttbfFn
    Index of Changes from version 1.x to 2.0
Appendix B: Revision History

Definitions
-----------

page (or sector):
512-byte segment of a Word for Windows file that begins on a 512-byte boundary.
(bytes 0-511 are in page 0, bytes 512-1023 are in page 1, etc.). In Word data structures, an unsigned two-byte integer page number is given the acronym PN (for Page Number).

document:
A named, multi-linked list of data structures, representing an ordered stream of text with properties that was produced by a user of Microsoft Word.

file:
The physical encoding of a Word document's text and sub data structures in a random access file.

CP (Character Position):
A four-byte integer which is the position coordinate of a character of text within the logical text stream of a document.

FC (File Character position):
A four-byte integer which is the byte offset of a character (or other object) from the beginning of the file. Before a file has been edited (i.e. in a full-saved Word document), CPs can be transformed into FCs by adding the FC coordinate of the beginning of a document's text stream to the CP. After a file has been edited (i.e. in a fast-saved Word document), the mapping from CP to FC is recorded in the piece table (see below).

PLCF (PLex of CPs (or FCs) stored in File):
A data structure consisting of two parallel arrays that allows a relation to be established between a certain CP position in the document text stream (or FC position in a file) and an arbitrary data structure. It consists of an array of n+1 CPs or FCs followed by an array of n instances of a particular arbitrary data structure. In typical usage, the nth CP or FC of the PLCF is in one-to-one correspondence with the nth instance of the arbitrary data structure, with the n+1st CP or FC marking the limit of the nth instance's influence. When a PLCF is used to record a partitioning of the document's text stream or a partitioning of the bytes stored in a of the 0th mark or link.

To properly interpret a PLCF stored in a Word file, the length of the stored PLCF and the length of the arbitrary data structure stored in the PLCF must be known. The length of the stored PLCF is recorded in the FIB.
The lengths of the data structures stored in PLCFs within Word files are listed later in this document.

piece table:
The piece table is a data structure that describes the logical sequence of characters in a Word document and records recent changes to the formatting of a Word document. It is stored in a Word file as a PLCF named the plcfpcd (PLex of CPs containing Piece Descriptors). The piece table relates a logical character number, called a CP (Character Position), to a physical location within a Word file (an FC). The array of CPs in the plcfpcd defines a partitioning of the Word document into disjoint pieces. The second array is an array of PCDs (Piece Descriptors), in 1-to-1 correspondence with the array of CPs, that records the physical location in the Word file where the corresponding piece begins.

To find the physical location of a particular logical character in a Word document, take the CP coordinate of that character within the document and find the piece that contains that character. This is done by finding the index of the largest CP in the array of CPs that is less than or equal to the CP of the character. Then reference the PCD with that index in the array of PCDs. The FC stored in the PCD gives the position of the beginning of the piece in the file. Finally, add the offset of the desired character from the beginning of its piece to the FC of the beginning of the piece. This gives the actual file offset of the character.

sprm (Single PRoperty Modifier):
An instruction to modify one or more properties within one of the property defining data structures (CHP, PAP, TAP, SEP, or PIC). It consists of an operation code which identifies the field(s) to be changed, and an operand which either gives the value that a particular field is changed to or is a parameter to a procedure which will change the field or fields. The operand is omitted for sprms whose opcodes completely specify the values that must be stored in the property data structure.
A synonym used for sprm in some data structure definitions is prl (property modifiers stored in a list).

grpprl (group of prls):
A grpprl is a data structure that records a set of sprms. The 0th sprm is recorded at offset 0 of the structure. Any succeeding sprms are recorded immediately after the end of the preceding sprm. To traverse a grpprl and locate the sprms recorded within it, it’s necessary to fetch the opcode of the first sprm, look up the length of the sprm with that opcode, use that length to skip past the first sprm, fetch the opcode of the second sprm, look up the length of that sprm, use the length to skip the second sprm, and so on. See the table in the “SPRM Definitions” topic to determine the length of a sprm.

The phrase “apply the sprms of a grpprl (or PAPX or SEPX)” used later in this document means to fetch the 0th sprm recorded in the grpprl and perform the action for that sprm, fetch the first sprm and perform its action, and continue this procedure until all sprms in the grpprl (or PAPX or SEPX) have been processed.

prm (PRoperty Modifier):
A field in piece table entries that records how the properties of text within a piece were changed to reflect user formatting operations. The prm usually contains an index to a grpprl which records the user’s formatting changes as a group of sprms. If the user has made only a small change to formatting that can be expressed as a single 1- or 2-byte sprm, that sprm is stored within the prm.

full-saved (or non-complex) file:
A Word file in which the physical order of characters stored in the file is identical to the logical order of characters in the document that the file represents. The text stream of a non-complex file can be described by an fc (an offset from the beginning of the file) to mark where the text begins and a ccp (count of CPs) to

fast-saved (or complex) file:
A Word file in which the physical order of characters stored in the file does not match the logical order of characters in the document that the file represents.
A piece table must be stored in the file to describe the text stream of the document.

FIB (File Information Block):
The header of a Word for Windows file. Begins at offset 0 in the file. Gives the beginning offset and lengths of the document's text stream and subsidiary data structures within the file. Also stores other file status information.

paragraph
A contiguous sequence of characters within the text stream of a document that is delimited by a paragraph mark, cell mark, row mark, or a section mark (these are special characters described later in this document).

run of text
A contiguous sequence of characters within the text stream of a document that have the same character formatting properties. A single run may cross paragraph boundaries and may encompass the entire document.

section
A contiguous sequence of paragraphs within the text stream of a document that is delimited by a section mark or by the final paragraph mark at the end of a document. Users frequently treat sections as the equivalent of a chapter in a book. The boundaries of sections mark locations where the layout rules for a document (number of columns, text of headers and footers to use, whether page numbers should be displayed, etc.) are changed.

style
A named set of character and paragraph properties that can be associated with any number of paragraphs in a Word document's text stream. A style provides a set of property defaults for any paragraph tagged with that style. When a new paragraph is created and given a particular style, newly typed text is given the character and paragraph properties of that style unless the user makes an exception to the style definition.

CHP (CHaracter Properties)
The data structure describing the character properties of a run of text.

CHPX (Character Property EXception)
A data structure with the same form as a CHP but which has different semantics.
It describes how the properties of a run of text differ from the character properties of the styles of paragraphs that contain the run. By applying a CHPX to the character properties (CHP) inherited by a particular paragraph from its style, it is possible to reconstitute the CHP for the portion of the character run that intersects that paragraph.

PAP (PAragraph Properties)
The data structure which describes the properties of a particular paragraph.

PAPX (PAragraph Property EXception)
A data structure describing how a particular paragraph’s properties differ from the paragraph properties of the style assigned to the paragraph. By applying a PAPX to the paragraph properties (PAP) inherited by a particular paragraph from its style, it is possible to reconstitute the PAP for that paragraph. The PAPX contains an STC (a style code to identify the style in control of the paragraph), paragraph height information, and a grpprl which specifies how the style's paragraph properties must be changed to produce

table row:
sequences of paragraphs called cells. The last paragraph of each cell is terminated by a special paragraph mark called a cell mark. Following the cell mark that ends the last cell of a table row, the table row is terminated by a special paragraph mark called a row mark.

When Word displays a table row, it assigns a rectangular display area to each cell in the row. The tops of all of the cell display areas are aligned at the same vertical position on a page. The leftmost display area in a table row is assigned to the 0th cell of the row; the next display area to the right is assigned to the 1st cell of the row, etc. The text of a cell is wrapped to fit its display area. As more text is added to the cell, the cell display area extends downward.
A set of table properties that determine how many cells are in a row, where the horizontal boundaries of cell display areas are, and what borders are drawn around each cell in the table is stored for the row mark that marks the end of the table row.

TAP (TAble Properties):
The data structure which describes the properties of a single table row. The information in the TAP for a table row is stored in a Word file as a list of sprms that modify a TAP which has been cleared to zeros. This list of table sprms is appended to the grpprl of paragraph sprms that is recorded in the PAPX for the row mark that delimits the end of a table row.

STSH (STyle SHeet)
A data structure which represents every style defined within the Word document. The STSH records a unique name string for every style and associates each name with a particular CHP and PAP. The indexes used to refer to individual styles are called STCs (STyle Codes). Every PAPX for every paragraph recorded in a document contains an STC which identifies the style from which a paragraph inherited its default character and paragraph properties. CHPXs recorded for the text within the paragraph and PAPXs recorded for the paragraph itself encode changes that the user has made with respect to the style’s default properties.

FKP (Formatted disK Page):
A data structure that fits in one 512-byte page and encodes either the character properties or the paragraph properties of a certain portion of a Microsoft Word file. An FKP consists of four components:
1) a count of the number of runs or paragraphs described by the page.
2) an array of FCs recorded in ascending order demarcating the boundaries between runs or paragraphs that are recorded adjacent to one another in the Word file.
3) an array of offsets within the FKP in one-to-one correspondence with the array of FCs that locate the properties of the run or paragraph that begins at a particular FC.
4) a group of CHPXs if the FKP stores character properties or a group of PAPXs if the FKP stores paragraph and table properties.

To find the CHPX/PAPX corresponding to a particular character in a document, calculate the FC coordinate for that character. Then search the FKPs that encode the type of property you want to produce, to find the FKP whose array of FCs encompasses the FC of the document character. Then search within the FKP to find the index of the largest FC entry that is less than or equal to the FC of the document character. Use this index to look up an offset in the array of offsets within the FKP. Add this offset to the beginning address of the FKP in memory. This will be the first byte of the desired CHPX/PAPX.

bin table
Each FKP can be viewed as a bucket or bin that contains the properties of a certain range of FCs in the Word file. In Word files, a PLC, the plcfbte (PLex of FCs containing Bin Table Entries), is maintained. It records the association between a particular range of FCs and the PN (Page Number) of the FKP that contains the plcfbtePapx which records the location of every PAPX FKP must be stored.

In a non-complex, full-saved document, all of the CHPX FKPs are recorded in consecutive 512-byte pages with the FKPs recorded in ascending FC order, as are all of the PAPX FKPs. In a non-complex document, at least the first FKP page number will be recorded so that the beginning of the consecutive range of pages may be located. However, the bin table may be incomplete because of resource constraints placed on Word's save procedures. If a plcfbte is incomplete, the page numbers of the first n FKPs will be recorded but the last m FKPs will not be represented. The complete plcfbte may be reconstructed by the reader because the total number of CHPX FKPs and PAPX FKPs is recorded in the FIB.
When a reader notices that the number of entries in a plcfbte is less than the number of FKP pages that was recorded in the FIB, the reader must locate the last PN recorded in the plcfbte (call it pnLast). If the number of missing page entries is m, the reader would have to read pages pnLast + 1 through pnLast + m and record the first fc stored in each of the tables plus the last fc of page pnLast + 1 to produce a complete plcfbte.

SEP (SEction Properties)
The data structure describing the properties of a particular section.

SEPX (SEction Property EXceptions)
A data structure describing how the properties of a particular section differ from a Word-defined standard SEP. As in the PAPX, the differences between the SEP for a section and the standard SEP are encoded as a list of sprms that describe how the standard SEP can be transformed into the section's SEP. By applying a SEPX's sprms to the standard SEP, it is possible to reconstitute the SEP for that section.

The plcfsed, a data structure stored in a Word file, records the locations of all SEPXs stored in a Word file. The array of CPs in the plcfsed records the boundaries of sections in the Word document. The second array in the plcf, an array of SEDs (SEction Descriptors), is in 1-to-1 correspondence with the array of CPs. Each SED stores the beginning FC of the SEPX that records the properties for a section. If the FC stored in a SED is -1, the section properties of the section are exactly equal to the standard section properties.

The SEP for a particular section may be constructed if a CP of a character in that section is known. First search the array of CPs in the plcfsed for the index of the largest CP that is less than or equal to the CP of the character. Use this index to locate the SED in the plcfsed which describes the section. The FC stored in the SED is the offset from the beginning of the Word file at which the SEPX is stored.
If the stored FC is equal to 0xFFFFFFFF, then the SEP for the section is exactly equal to the standard SEP (see SEP structure definition). Otherwise, read the SEPX into memory and create a copy of the standard SEP. Finally, apply the sprms stored in the SEPX to the standard SEP to produce the SEP for a section.

DOP (DOcument Properties)
The data structure describing properties that apply to the document as a whole.

sub-document
A separate logical stream of text with properties for which correspondences with the main document text are maintained. Word's headers/footers, footnotes, macro procedure text, and annotation text are kept in separate subdocuments. Each subdocument has its own CP coordinate space. In other words, data structures are stored in Word files that are components of these subdocuments. These data structures contain CP coordinates whose 0 point is the beginning of the subdocument text stream instead of the beginning of the main document text stream.

In full-saved documents, a simple calculation with values stored in the FIB produces the file offset of the beginning of the subdocument text streams (if they exist). The length of these streams is also stored. In fast-saved documents, the piece tables of subdocuments are concatenated to the end of the main document piece table. In this case, to identify the beginning of subdocument text, you must sum the length CP coordinate, to find the physical location of each piece of the subdocument text stream.

field
A field is a two-part structure that may be recorded in the CP stream of a document. The first part of the structure contains field codes which instruct Word for Windows to insert text into the second part of the structure, the field result.
Fields in Word for Windows are used to insert text from an external file, to quote another part of a document, to mark index and table of contents entries and produce indexes and tables of contents, to maintain DDE links to other programs, and to produce dates, times, page numbers, sequence numbers, etc. There are 56 different field types. A field begin mark delimits the beginning of a field and precedes any of the field codes stored in the field. The end of the field codes and the beginning of the field result is marked with the field separator, and the field result and the field itself are terminated by a field end mark. The CP locations of the field begin mark, field separator, and field end mark are recorded in plcfld data structures that are maintained for the main document and all of the subdocuments of the main document whenever a field is inserted or edited. An array of two-byte FLD structures is stored in the plcfld in one-to-one correspondence with the CP entries recorded. An FLD associated with a field begin mark records the type of the field. An FLD associated with the field end mark records the current status of the field (i.e. whether the result is dirty or has been edited, whether the result has been locked, etc.). Fields may be nested. 20 levels of nesting are permitted.

bookmark
A bookmark associates a user-definable name with a range of text within a document. A bookmark is frequently used as an operand in field code instructions within a field. In Word for Windows a bookmark is represented by three parallel data structures: the sttbBkmk, the plcbkf and the plcbkl. The sttbBkmk is a string table which contains the name of each bookmark that is defined. The plcbkf records the beginning CP position of each bookmark. The plcbkl records the limit CP position that delimits the end of a bookmark.
Since bookmarks may be nested within one another to any level, the BKF structure stored in the plcbkf consists of a single index which specifies which plcbkl entry marks the end of the bookmark. Similarly, the BKL structure stored in the plcbkl consists of a single index which specifies which plcbkf entry marks the beginning of the bookmark.

picture
A picture is represented in the document text stream as a special character, an ASCII 1 whose CHP has the fSpec bit set to 1. The file location of the picture in the Word binary file is stored in the character's CHP in chp.fcPic. For Word for Windows, a picture may be a Windows metafile, a bitmap or a reference to a TIFF file. Beginning at the position recorded in chp.fcPic, a header data structure, the PIC, will be stored. If the picture is a Windows metafile or a bitmap, the metafile or bitmap will immediately follow the PIC. If the picture is a TIFF file, the filename of the TIFF file will be recorded immediately following the PIC.

embedded object
The native data for embedded objects (OBJs) is stored similarly to pictures (PICs). To locate the native data for embedded objects, scan the plc of field codes for the mother, header, footnote and annotation documents (fib.PlcffldMom/Hdr/Ftn/Atn). For each separator field, get the CHP. If chp.fSpec = 1 and chp.fObj = 1, then this separator field has an associated embedded object. The file location of the object data is stored in chp.fcObj. At the specified location an object header is stored, followed by the native data for the object. See the OBJHEADER structure.

Note: In this document, bit 0 is the low-order bit. Structures are described as they would be declared in C for the Intel architecture. When numbering bytes in a word from low offset towards high offset, two-byte integers will have their least significant eight bits stored in byte 0 and most significant eight bits in byte 1.
If bit 31 is the most significant bit in a four-byte integer, bits 31 through 24 will be stored in byte 3 of a four-byte integer, bits 23 through 16 in byte 2, bits 15 through 8 in byte 1, and bits 7 through 0 in byte 0.

Naming Conventions
------------------
The names in Word data structures usually consist of a lower case sequence of characters followed by an optional upper case modifier. The following tags are used in the lower case parts of field names to document the data type of a field:

f     used to name a flag (a variable containing a Boolean value). Usually the object referred to will contain either 1 (fTrue, TRUE) or 0 (fFalse, FALSE). (e.g. fWidowControl, fShadow)
l     used to name a 4-byte integer value (a long). (e.g. lcb)
w     used to name a 2-byte integer value (a word).
b     used to name a 1-byte integer value.
cp    used to name a variable that contains a character position within the document. Always a 4-byte quantity.
fc    used to name a variable that contains an offset from the beginning of a file. Always a 4-byte quantity.
xa    used to name a variable that contains a width of an object imaged on screen or on hard copy that is measured in units of 1/1440 of an inch. This unit, which is one-twentieth of a point (1/20 * 1/72”), is called a twip in this documentation. (e.g. xaPage is the width of a page)
ya    used to name a variable that contains a height of an object imaged on screen or on hard copy that is measured in twips.
dxa   used to name a variable that contains the horizontal distance of an object measured from some reference point expressed in twips. (e.g. pap.dxaLeft is the distance of the left boundary of a paragraph measured from the left margin of the page)
dya   used to name a variable that contains the vertical distance of an object measured from some reference point expressed in twips. (e.g. pap.dyaAbs is the vertical distance of the top of a paragraph from a reference frame declared in the pap)
dxp   used to name a variable that contains the horizontal distance of an object measured from some reference point expressed in Macintosh pixel units (1/72”).
      (e.g. dxpSpace)
dyp   used to name a variable that contains the vertical distance of an object measured from some reference point expressed in Macintosh pixel units (1/72”).
rg    prefix used to signify that the data structure being defined is an array. (e.g. rgb (an array of bytes), rgcp (an array of CPs), rgfc (an array of FCs), rgfoo (an array of foos))
i     prefix used to signify that an integer value is used as an index into an array. (e.g. itbd is an index into rgtbd, itc is an index into rgtc)
c     prefix used to signify that an integer value is a count of some number of objects. (e.g. a cb is a count of bytes, a cl is a count of lines, ccol is a count of columns, a cpe is a count of picture elements)
grp   prefix used to name an array of bytes that contains one or more copies of a variable-length data structure with the instances of the data structure stored one after the other in the array. (e.g. a grpprl is a group of prls)
grpf  prefix used to name an integer or byte value whose bits are used as flags. (e.g. grpfIhdt is a group of flags that records the types of headers that are stored for a particular section of a document)

The two following modifiers are used occasionally in this documentation:

First  means that the variable marks the first of a range of objects. For example, cpFirst would mark the first character position of a range of characters in a document. fcFirst would mark the file offset of the first byte of a range of bytes stored in a file.
Lim    means that the variable marks the limit of a range of objects (i.e. is the index of the last object in a range plus 1). For example, cpLim would be the limit CP of a range of characters in a document. fcLim would be the limit file offset of a range of bytes stored in a file.

Non-Complex File Format
-----------------------
A Word binary file (non-complex format) consists of the Word file header (FIB), the text, and the formatting information.

FIB
    Stored at beginning of page 0 of the file. fib.fComplex will be set to zero.
text of body, footnotes, headers
    Text begins at the position recorded in fib.fcMin.
group of SEPXs
    SEPXs immediately follow the text and are concatenated one after the other. A SEPX may not span a 512-byte page boundary. If a SEPX will not fit in the space that remains in a page from recording previous text or SEPXs, space is skipped to allow the SEPX to start on a page boundary. A SEPX is guaranteed to be less than 512 bytes in length. If all sections in the document have default properties, no SEPXs are stored.
pictures
    Word picture structures immediately follow the preceding text/SEPXs and are concatenated one after the other if the document contains pictures.
embedded objects-native data
    Word embedded object structures immediately follow the preceding text/SEPXs/pictures and are concatenated one after the other if the document contains embedded objects.
FKPs for CHPs
    The first CHP FKP begins at the first 512-byte boundary after the last byte of text/SEP/picture/embedded objects written. The remaining CHP FKPs are recorded in the 512-byte pages that immediately follow.
FKPs for PAPs
    The first PAP FKP is written in the 512-byte page that immediately follows the page used to record the last CHP FKP. The remaining PAP FKPs are recorded in the 512-byte pages that follow.
stsh (style sheet)
    The style sheet is written at the beginning of the 512-byte page that immediately follows the last PAP FKP. This is recorded in all Word for Windows documents.
plcffndRef (footnote reference position table)
    Written immediately after the stsh if the document contains footnotes.
plcffndTxt (footnote text position table)
    Written immediately after the plcffndRef if the document contains footnotes.
plcfandRef (annotation reference position table)
    Written immediately after the plcffndTxt if the document contains annotations.
plcfandTxt (annotation text position table)
    Written immediately after the plcfandRef if the document contains annotations.
plcfsed (section table)
    Written immediately after the plcfsed, if paragraph heights have been recorded.
plcfpgd (page table)
    Written immediately after the previously recorded table, if page boundary information is recorded.
sttbGlsy (glossary name string table)
    Written immediately after the previously recorded table, if the document stored is a glossary.
plcfglsy (glossary entry text position table)
    Written immediately after the sttbGlsy, if the document stored is a glossary.
plcfhdd (header text position table)
    Written immediately after the previously recorded table, if the document contains headers or footers.
plcfbteChpx (bin table for CHP FKPs)
    Written immediately after the previously recorded table. This is recorded in all Word for Windows documents.
plcfbtePapx (bin table for PAP FKPs)
    Written immediately after the plcfbteChpx. This is recorded in all Word for Windows documents.
sttbfFn (table of font name strings)
    Written immediately after the plcfbtePapx. This is recorded in all Word for Windows documents. The names of the fonts correspond to the FTC codes in the CHP structure. For example, the first font name listed is the name for ftc = 0 (see footnote 1).
plcffldMom (table of field positions and statuses for main document)
    Written immediately after the sttbfFn if the main document contains fields.
plcffldHdr (table of field positions and statuses for header subdocument)
    Written immediately after the previously recorded table, if the header subdocument contains fields.
plcffldFtn (table of field positions and statuses for footnote subdocument)
    Written immediately after the previously recorded table, if the footnote subdocument contains fields.
plcffldAtn (table of field positions and statuses for annotation subdocument)
    Written immediately after the previously recorded table, if the annotation subdocument contains fields.
plcffldMcr (table of field positions and statuses for macro subdocument)
    Written immediately after the previously recorded table, if the macro subdocument contains fields.
sttbfBkmk (table of bookmark name strings)
    Written immediately after the previously recorded table, if the document contains bookmarks.
plcfBkmkf (table recording beginning CPs of bookmarks)
    Written immediately after the sttbfBkmk, if the document contains bookmarks.
plcfBkmkl (table recording limit CPs of bookmarks)
    Written immediately after the plcfBkmkf, if the document contains bookmarks.
cmds (recording of command data structures)
    Written immediately after the previously recorded table, if special commands are linked to this document.
plcfmcr (macro text position table -- delimits boundaries of text for macros stored in macro subdocument)
    Written immediately after the previously recorded table, if a macro subdocument is recorded.
sttbfMcr (table of macro name strings)
    Written immediately after the plcfmcr, if a macro subdocument is recorded.
PrEnv (data structures recording the print environment for document)
    Written immediately after the previously recorded table, if a print environment is recorded for the document.
wss (window state structure)
    Written immediately after the end of previously recorded structure, if the document was saved while a window was open.

Footnote 1: In the WinWord 1.x format, the names of the first three fonts were omitted from the table and assumed to be "Tms Rmn" (for ftc = 0), "Symbol", and "Helv". In WinWord 2.0, the names for all fonts are included explicitly in the table. It is still true that ftc = 0 represents the "best" Roman PS font on the system, ftc = 1 represents the Symbol font, and ftc = 2 represents the "best" Swiss (Sans Serif) PS font available.

dop (document properties record)
    Written immediately after the end of previously recorded structure. This is recorded in all Word for Windows documents.
sttbfAssoc (table of associated strings)
Autosave source (name of original)
    Written immediately after the sttbfAssoc table. This field only appears in autosave files. These files are normal Word for Windows documents in every other way.
    Also, autosaved files are typically in the complex file format except that we don't overwrite the tables (PLCF*, etc.). That is, an autosaved file is typically longer than the equivalent Word for Windows document.

Complex File Format
-------------------
A Word binary file (complex format) consists of the Word file header (FIB), the text, and the formatting information.

FIB
Text of body, footnotes, headers stored during last full save
    Text begins at the position recorded in fib.fcMin.
Group of SEPXs stored during last full save
Pictures stored during last full save
Embedded objects stored during last full save
FKPs for CHPs during last full save
    The first CHP FKP begins at the first 512-byte boundary after the last byte of text/SEP/picture/embedded object written. The remaining CHP FKPs are recorded in the 512-byte pages that immediately follow.
FKPs for PAPs during last full save
    The first PAP FKP is written in the 512-byte page that immediately follows the page used to record the last CHP FKP. The remaining PAP FKPs are recorded in the 512-byte pages that follow.
STSH (if style sheet has not grown since last full save)
Any text, SEPXs, pictures or embedded objects stored during first fast save
Any CHP FKPs stored during first fast save
Any PAP FKPs stored during first fast save
Any text, SEPXs, pictures or embedded objects stored during second fast save
Any CHP FKPs stored during second fast save
Any PAP FKPs stored during second fast save
...
Any text, SEPXs, pictures or embedded objects stored during nth fast save
Any CHP FKPs stored during nth fast save
Any PAP FKPs stored during nth fast save
stsh (if style sheet has grown since last full save)
plcffndRef (footnote reference position table)
    Written immediately after the stsh if the document contains footnotes.
plcffndTxt (footnote text position table)
    Written immediately after the plcffndRef if the document contains footnotes.
plcfandRef (annotation reference position table)
    Written immediately after the plcffndTxt if the document contains annotations.
plcfandTxt (annotation text position table)
    Written immediately after the plcfandRef if the document contains annotations.
plcfsed (section table)
    Written immediately after the previously recorded table. Recorded in all Word for Windows documents.
plcfpgd (page table)
    Written immediately after the previously recorded table, if page boundary information is recorded.
sttbGlsy (glossary name string table)
    Written immediately after the previously recorded table, if the document stored is a glossary.
plcfglsy (glossary entry text position table)
    Written immediately after the sttbGlsy, if the document stored is a glossary.
plcfhdd (header text position table)
    Written immediately after the previously recorded table, if the document contains headers or footers.
plcfbteChpx (bin table for CHP FKPs)
    Written immediately after the previously recorded table. This is recorded in all Word for Windows documents.
plcfbtePapx (bin table for PAP FKPs)
    Written immediately after the plcfbteChpx. This is recorded in all Word for Windows documents.
sttbfFn (table of font name strings)
    Written immediately after the plcfbtePapx. This is recorded in all Word for Windows documents. The names of the fonts correspond to the FTC codes in the CHP structure. For example, the first font name listed is the name for ftc = 0 (see footnote 1).
plcffldMom (table of field positions and statuses for main document)
    Written immediately after the sttbfFn if the main document contains fields.
plcffldHdr (table of field positions and statuses for header subdocument)
    Written immediately after the previously recorded table, if the header subdocument contains fields.
plcffldFtn (table of field positions and statuses for footnote subdocument)
    Written immediately after the previously recorded table, if the footnote subdocument contains fields.
plcffldAtn (table of field positions and statuses for annotation subdocument)
    Written immediately after the previously recorded table, if the annotation subdocument contains fields.
plcffldMcr (table of field positions and statuses for macro subdocument)
    Written immediately after the previously recorded table, if the macro subdocument contains fields.
sttbfBkmk (table of bookmark name strings)
    Written immediately after the previously recorded table, if the document contains bookmarks.
plcfBkmkf (table recording beginning CPs of bookmarks)
    Written immediately after the sttbfBkmk, if the document contains bookmarks.
plcfBkmkl (table recording limit CPs of bookmarks)
    Written immediately after the plcfBkmkf, if the document contains bookmarks.
cmds (recording of command data structures)
    Written immediately after the previously recorded table, if special commands are linked to this document.
plcfmcr (macro text position table -- delimits boundaries of text for macros stored in macro subdocument)
    Written immediately after the previously recorded table, if a macro subdocument is recorded.
sttbfMcr (table of macro name strings)
    Written immediately after the plcfmcr, if a macro subdocument is recorded.
PrEnv (data structures recording the print environment for document)
    Written immediately after the previously recorded table, if a print environment is recorded for the document.
wss (window state structure)
    Written immediately after the end of previously recorded structure, if the document was saved while a window was open.

Footnote 1: In the WinWord 1.x format, the names of the first three fonts were omitted from the table and assumed to be "Tms Rmn" (for ftc = 0), "Symbol", and "Helv". In WinWord 2.0, the names for all fonts are included explicitly in the table. It is still true that ftc = 0 represents the "best" Roman PS font on the system, ftc = 1 represents the Symbol font, and ftc = 2 represents the "best" Swiss (Sans Serif) PS font available.
dop (document properties record)
    Written immediately after the end of previously recorded structure. This is recorded in all Word for Windows documents.
sttbfAssoc (table of associated strings)
Autosave source (documented above)

File Information Block (FIB)
----------------------------
The FIB contains a "magic word" and pointers to the various other parts of the file, as well as information about the length of the file. The FIB starts at the beginning of the file and fits within the first page of the file. The FIB is defined in the structure definition section of this document.

Text
----
The text of the file starts at fib.fcMin. fib.fcMin is usually set to the next 128-byte boundary after the end of the FIB. The text in a Word document is ASCII text with the following restrictions (ASCII codes given in decimal):

- Paragraph ends (or line ends in unformatted files) are stored as <Carriage Return, Line Feed> (ASCII 13, ASCII 10). No other occurrences of this character sequence are allowed.
- Hard line breaks which are not paragraph ends are stored as ASCII 11. Other line break or word wrap information is not stored.
- Breaking hyphens are stored as ASCII 45 (normal hyphen code). Non-required hyphens are ASCII 31. Non-breaking hyphens are stored as ASCII 30.
- Non-breaking spaces are stored as 160. Normal spaces are ASCII 32.
- Page breaks and section marks are ASCII 12 (normal form feed); if there's an entry in the section table, it's a section mark, otherwise it's a page break.
- Column breaks are stored as ASCII 14.
- Tab characters are ASCII 9 (normal).
- The field begin mark, which delimits the beginning of a field, is ASCII 19. The field end mark, which delimits the end of a field, is ASCII 21. The field separator, which marks the boundary between the preceding field code text and following field expansion text within a field, is ASCII 20. The field escape character is the '\' character, which also serves as the formula mark.
- The cell mark, which delimits the end of a cell in a table row, is stored as ASCII 7 and has the fInTable paragraph property set to fTrue (pap.fInTable == 1).
- The row mark, which delimits the end of a table row, is stored as ASCII 7 and has the fInTable paragraph property and fTtp paragraph property set to fTrue (pap.fInTable == 1 && pap.fTtp == 1).

The following ASCII codes are treated as "special" characters when they have the character property special on (chp.fSpec == 1):

    1   Picture
    2   Autonumbered footnote reference
    3   Footnote separator character
    4   Footnote continuation character
    5   Annotation reference
    6   Hand annotation (generated in Pen Windows)

Note: The end of a section is also the end of a paragraph. The last character of a section is a section mark which stands in place of the paragraph mark normally required to end a paragraph. An exception is made for the last character of a

If !fib.fComplex, the document text stream is represented by the text beginning at fib.fcMin up to (but not including) fib.fcMac. Otherwise, the document is represented by the piece table stored in the file in the data beginning at fib.fcClx.

The document text stream includes text that is part of the main document, plus any text that exists for the footnote, header, macro, or annotation subdocuments. The sizes of the main document and the footnote, header, macro and annotation subdocuments are stored in the fib, in variables fib.ccpText, fib.ccpFtn, fib.ccpHdr, fib.ccpMcr, and fib.ccpAtn respectively.
In a non-complex file, this means that:

- the text of the main document begins at fib.fcMin in the file and extends to fib.fcMin + fib.ccpText;
- the text of the footnote subdocument begins at fib.fcMin + fib.ccpText and extends to fib.fcMin + fib.ccpText + fib.ccpFtn;
- the text of the header subdocument begins at fib.fcMin + fib.ccpText + fib.ccpFtn and extends to fib.fcMin + fib.ccpText + fib.ccpFtn + fib.ccpHdr;
- the text of the macro subdocument begins at fib.fcMin + fib.ccpText + fib.ccpFtn + fib.ccpHdr and extends to fib.fcMin + fib.ccpText + fib.ccpFtn + fib.ccpHdr + fib.ccpMcr;
- the text of the annotation subdocument begins at fib.fcMin + fib.ccpText + fib.ccpFtn + fib.ccpHdr + fib.ccpMcr and extends to fib.fcMin + fib.ccpText + fib.ccpFtn + fib.ccpHdr + fib.ccpMcr + fib.ccpAtn.

In a complex, fast-saved file, each text stream must be located by examining piece table entries:

- the main document text, from the 0th piece table entry through the entry that describes cp = fib.ccpText;
- the footnote subdocument's text, from the entry that describes cp = fib.ccpText through the entry that describes cp = fib.ccpText + fib.ccpFtn;
- the header subdocument's text, from the entry that describes cp = fib.ccpText + fib.ccpFtn through the entry that describes cp = fib.ccpText + fib.ccpFtn + fib.ccpHdr;
- the macro subdocument's text, from the entry that describes cp = fib.ccpText + fib.ccpFtn + fib.ccpHdr through the entry that describes cp = fib.ccpText + fib.ccpFtn + fib.ccpHdr + fib.ccpMcr;
- the annotation subdocument's text, from the entry that describes cp = fib.ccpText + fib.ccpFtn + fib.ccpHdr + fib.ccpMcr through the entry that describes cp = fib.ccpText + fib.ccpFtn + fib.ccpHdr + fib.ccpMcr + fib.ccpAtn.
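The running sums above are easy to get wrong by hand, so here is a minimal Python sketch of them. The function name and the dictionary keys are hypothetical; the arguments stand for values already read from the FIB, and the sketch assumes one byte per character, as the ANSI text section implies.

```python
def subdocument_ranges(fc_min, ccp_text, ccp_ftn, ccp_hdr, ccp_mcr, ccp_atn):
    """Return {name: (first, lim)} for the main document and each subdocument.

    With fc_min = fib.fcMin the pairs are file offsets in a non-complex file.
    With fc_min = 0 the pairs are the CP ranges to look up in the piece table
    of a complex (fast-saved) file. Each lim is one past the last character.
    """
    lengths = (
        ("main", ccp_text),
        ("footnote", ccp_ftn),
        ("header", ccp_hdr),
        ("macro", ccp_mcr),
        ("annotation", ccp_atn),
    )
    ranges = {}
    start = fc_min
    for name, ccp in lengths:
        ranges[name] = (start, start + ccp)
        start += ccp  # the next stream begins where this one ends
    return ranges
```

A subdocument that does not exist has a count of zero and therefore comes out as an empty range whose first and lim are equal.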
Character and Paragraph Formatting Properties
---------------------------------------------
Character and paragraph properties in Word documents are stored in a compressed format. The information that is stored on disk is not the actual properties of a particular sequence of text but the difference of the properties of a sequence from some reference property. The PAP is a data structure that holds uncompressed paragraph property information; the CHP (pronounced like "chip") is a structure that holds uncompressed character property information. Each paragraph in a Word document inherits a default set of paragraph and character properties from one of the styles recorded in the style sheet data structure (STSH).

A particular PAP is converted into its compressed form, the PAPX, by first comparing the pap for a paragraph with the pap stored in the style sheet for the paragraph's style. Any properties in the paragraph's PAP that are different from those stored in the style sheet PAP are encoded as a list of sprms (grpprl). The sprms express how the content of the style sheet PAP should be transformed to create the properties for the paragraph. A PAPX is a variable-length data structure that begins with a count of words that encodes the PAPX length. It contains an STC (style code) which specifies which style entry in the style sheet contains the default paragraph and character properties for the paragraph, paragraph height information, and the list of difference sprms. If the only difference between the paragraph's PAP and the style's PAP were in the justification code field, which is one byte long, one two-byte sprm, sprmPJc, would be generated to express that difference; thus the total PAPX size would be 9 bytes. This is better than 23-to-1 compression, since the total size of a PAP is 210 bytes.

To convert a CHP for a sequence of characters contained within a single paragraph into its compressed form, the CHPX, first clear a local instance of a CHP to zeros.
(A CHPX has the same form as a CHP, but is interpreted with a different algorithm.) This local instance will be transformed into the CHPX. In the CHP are a set of properties encoded as single bits (e.g. fBold, fItalic), and another set of properties encoded as multi-bit, byte or word fields (FTC, HPS). When there is a difference in one of the single-bit properties between the character sequence property and the style property, the bit for that property is set to 1 in the local instance. Any single-bit properties that are the same in the two versions will be left 0 in the local instance. The idea is that when a CHPX is interpreted, all of the single-bit properties in the CHPX will be xor-ed with the single-bit properties in the style's character properties. This will produce a CHP that has the same single-bit settings as the character property of the character sequence.

Each of the multi-bit, byte, and word fields in a CHP is assigned a bit in the first 16 bits of the CHPX which, when true, means that there is a difference in the corresponding non-bit field. For example, the font code field in the CHP (chp.ftc) is assigned a bit in the first 16 bits called chp.fsFtc. When a difference is detected in a non-bit field, the difference bit in the first 16 bits of the CHPX that corresponds to that property is set to 1, and the contents of the field in the character sequence's CHP are copied to the local instance. If one of the non-bit fields is unchanged from its setting in the style's CHP, the equivalent value stored in the CHPX will be 0.

Since only the non-zero prefix of a CHPX is recorded in Word files, fairly good compression can be achieved. For example, if the character sequence CHP was changing the boldness of the style character property and was changing the font code to 0, a difference bit for boldness would be recorded in the 0th byte of the CHPX and a difference bit for the font code would be set in the 1st byte of the CHPX.
In this case the CHPX would consist of a 1-byte length code plus a 2-byte non-zero prefix. This would be 4-to-1 compression.

If a sequence of characters has the same character properties and the sequence spans more than one paragraph, it's necessary to examine each paragraph's properties and to generate a different CHPX every time there is a change of style.

In Word documents, the fundamental unit of text for which character exception information is kept is the run of exception text, a contiguous sequence of characters stored on disk that all have the same exception properties with respect to their underlying style character properties. Each run would have an entry recorded in a CHPX FKP. If a user never changed the character properties inherited from the styles used in his document and did a complete save of his document, although each of those styles may have different properties, the entire document stream would be one large run of exception text and one CHPX would suffice to describe the character properties of the entire document.

The fundamental unit of text for which paragraph properties are recorded is the paragraph. Every paragraph has an entry recorded in a PAPX FKP.

The CHPX FKP and the PAPX FKP have the same physical structure. An FKP is a 512-byte data structure that is stored in one page of a Word file. At offset 511 is a 1-byte count named crun, which is a count of runs of exception text for CHPX FKPs and a count of paragraphs for PAPX FKPs. Beginning at offset 0 of the FKP is an array of crun + 1 FCs, named rgfc, which records the beginning and limit FCs of crun runs of exception text or paragraphs. Immediately following rgfc is a byte array of crun word offsets to CHPXs or PAPXs from the beginning of the FKP. This byte array, named rgb, is in 1-to-1 correspondence with the rgfc.
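The layout just described (rgfc at offset 0, rgb right after it, crun at offset 511) can be read back with a few lines of Python. This is a sketch under stated assumptions rather than a reference reader: the name parse_fkp is hypothetical, FCs are taken to be 4-byte little-endian values as the naming conventions and the Intel byte-order note indicate, and a zero rgb entry is reported as None because this document reserves 0 for a predefined property set.

```python
import struct

FKP_SIZE = 512


def parse_fkp(page):
    """Return one (fcFirst, fcLim, byte offset or None) tuple per run."""
    if len(page) != FKP_SIZE:
        raise ValueError("an FKP occupies exactly one 512-byte page")
    crun = page[511]                      # 1-byte count stored in the last byte
    rgb_start = 4 * (crun + 1)            # rgb follows the crun + 1 FCs of rgfc
    if rgb_start + crun > 511:
        raise ValueError("crun is too large for a single page")
    rgfc = struct.unpack_from("<%dI" % (crun + 1), page, 0)
    rgb = page[rgb_start:rgb_start + crun]
    runs = []
    for i in range(crun):
        # rgb holds word offsets, so the byte offset is twice the stored value.
        offset = rgb[i] * 2 if rgb[i] else None
        runs.append((rgfc[i], rgfc[i + 1], offset))
    return runs
```

The CHPX or PAPX for a run then starts at the returned byte offset within the same page.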
The ith rgb gives the word offset of the exception property that belongs to the run/paragraph whose beginning in FC space is rgfc[i] and whose limit in FC space is rgfc[i+1]. The fact that the value stored in rgb is a word offset implies that CHPXs and PAPXs are stored in FKPs beginning on word boundaries. Since the values stored in the rgb allow random access throughout the FKP, space within an FKP can be conserved by storing the offset of the same physical CHPX/PAPX in rgb entries when several runs or paragraphs in the FKP have the same properties. Word uses this optimization.

An rgb value of 0 is used in another optimization. When an rgb value of 0 is stored in an FKP, it means that instead of referring to a particular CHPX/PAPX in the FKP, the 0 value is a signal that the reader should construct for itself a commonly encountered predefined set of properties. pixels, with a column width of 7980 dxas.

When new entries are added to an FKP, there must be unallocated space in the middle of the FKP equal to 5 bytes (size of an FC plus size of one-byte word offset), plus the size of the new CHPX or PAPX if the property being added is not already recorded in the FKP and is not the property coded with a 0 rgb value. To add a new property, existing rgb entries are moved four bytes to the right in the FKP. The new FC is added at the end of the rgfc. The new CHPX or PAPX is recorded on a 2-byte boundary before the previously recorded properties stored at the end of the block. The word offset of the beginning of the CHPX or PAPX is stored as the last entry of the relocated rgb, and finally, the crun stored at offset 511 is incremented.

Bin Tables
----------
A bin table (plcfbte) partitions the total extent of the Word file that contains text characters into a set of contiguous intervals marked by an fcFirst and an fcLim. The fcFirst for the nth interval would be plcfbte.rgfc[n] and the fcLim for the nth interval would be plcfbte.rgfc[n+1]. Associated with each interval is a BTE.
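The interval lookup described above might be sketched as follows. Treat it as an illustration: parse_plcfbte and find_fkp_page are hypothetical names, the sketch assumes the usual PLCF layout of n + 1 four-byte FCs followed by n BTEs, and it takes each BTE to be a two-byte page number, which is how this document goes on to describe it.

```python
import struct
from bisect import bisect_right


def parse_plcfbte(data):
    """Split raw plcfbte bytes into (rgfc, rgpn). Layout is an assumption."""
    if len(data) < 4:
        raise ValueError("a plcfbte holds at least one FC")
    n, rest = divmod(len(data) - 4, 4 + 2)   # each interval adds one FC + one PN
    if rest:
        raise ValueError("length does not match n + 1 FCs and n two-byte PNs")
    rgfc = list(struct.unpack_from("<%dI" % (n + 1), data, 0))
    rgpn = list(struct.unpack_from("<%dH" % n, data, 4 * (n + 1)))
    return rgfc, rgpn


def find_fkp_page(rgfc, rgpn, fc):
    """Return the PN of the FKP covering fc, where rgfc[n] <= fc < rgfc[n+1]."""
    if fc < rgfc[0] or fc >= rgfc[-1]:
        raise ValueError("fc lies outside the text covered by the bin table")
    return rgpn[bisect_right(rgfc, fc) - 1]
```

Because an FKP fills one 512-byte page, the page itself would then be read from file offset PN * 512.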
A BTE holds a two-byte PN (page number) which identifies the FKP page in the file which contains the formatting information for that interval. A CHPX FKP further partitions an interval into runs of exception text. A PAPX FKP in a non-complex, full-saved file, partitions the text within intervals into paragraphs. If a file is in complex format (has been fast-saved), the PAPX FKP only records the FCs within the text that are preceded by a paragraph mark. Even though a sequence of text may be physically located between two paragraph end marks, it may reside in a paragraph different from the one defined by the following paragraph end mark, because the text may have been moved by the user into a different paragraph. In the logical text stream represented by the document's piece table, the paragraph mark that follows the moved text is stored in a non- adjacent physical location in the file. Style Sheets The style sheet establishes a correspondence between a style code (a number in the range 0-255) and a style, a set of paragraph and character formatting information. When a particular style code is recorded in the pap.stc, the paragraph properties corresponding to the style code are used as the default paragraph properties for the paragraph and the character properties corresponding to the style code are used as the default character properties for all paragraph characters. Any property differences from these "defaults" are recorded in the CHPXs and PAPXs discussed previously. A style is always based on another style, with the differences being stored as CHPXs and PAPXs in the style sheet. There can be a chain of "based on" styles up to 10 deep; that is, a style can have up to 9 "ancestors" (not including itself). Eventually, every chain must end eventually at the null style (style code 222). The null style, not based on any other style, has the standard character and paragraph properties. The standard CHP has chp.hps = 20 with all other fields set to zero. 
The standard PAP has all fields set to zero. To arrive at the final CHP and PAP of a paragraph, the CHPXs and PAPXs in the based-on chain must all be applied, starting at the null style. The style sheet is stored in the file in the following format: Field Size Comment cstcStd 2 bytes count of standard STCs used in this document sttbName: cbName 2 bytes count of bytes in sttbName (including cb) grpst cbName - 2 group of style names, stored as st's (string preceded by length byte) sttbChpx: character properties of the base style whose style code is recorded in the plestcp sttbPapx: cbPapx 2 bytes cb of sttbPapx (inclusive) grpst cbPapx - 2 group of PAPXs, where PAPXs are modifications from the paragraph properties of the base style whose style code is recorded in the plfestcp plfestcp: iMac 2 bytes count of entries in dnstcp dnstcp 2 * iMac bytes array of estcp's (see below) The three sttbs and the plestcp are parallel structures, indexed by stcps. Using stcps instead of STCs to index the sttbs minimizes the amount of space taken by undefined styles in the style sheet . A "defined" style is one whose sttbName entry is not 255 (see below). A style needs to be defined if it is either referenced explicitly in the document or if a style which is referenced is based on it. The stcps are derived as follows: The styles must be stored sequentially in the style sheet. Style codes in the range 1-221 are reserved for user-defined styles (not all of which are necessarily defined for a particular document) and those in the range 222-255 and 0 are reserved for standard styles (again, not all of which are necessarily defined). If we start with the minimum defined standard style, list all the rest of the standard styles, followed by the user-defined ones, up to the maximum one that is defined, we have a sequential list which is of minimal size that encompasses all of the defined styles. The stcps are simply the indices of this transformed list of STCs. 
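As a small illustration of how stcps relate to STCs, the sketch below applies the modulo-256 mapping that the format defines between the two, stcp = (stc + cstcStd) & 255 and its inverse. The function names and the example cstcStd value of 6 are hypothetical, chosen only to show the wrap-around.

```python
def stc_to_stcp(stc: int, cstc_std: int) -> int:
    """Map a style code to its index in the style sheet tables."""
    return (stc + cstc_std) & 255


def stcp_to_stc(stcp: int, cstc_std: int) -> int:
    """Inverse mapping: recover the style code from a table index."""
    return (stcp - cstc_std) & 255


cstc_std = 6  # hypothetical value, for illustration only
for stc in (250, 255, 0, 1):
    print(stc, "->", stc_to_stcp(stc, cstc_std))
```

With this hypothetical cstcStd, the high-numbered standard style codes land on the first indices, style code 0 follows them, and the user-defined codes come after that, which matches the ordering described above.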
There is a 1-to-1 mapping from STCs to stcps and back:

    stcp = (stc + cstcStd) & 255
    stc = (stcp - cstcStd) & 255

The dnstcp is an array which can be indexed by the stcp derived from the stc.

The grpst's in the sttbs are constructed from a number of variable length st's (or CHPXs or PAPXs) run together. Since the length of an st is encoded right in the st, the nth st can be extracted from a grpst by scanning sequentially from the beginning of the grpst.

An estcp is a 2-byte structure composed of two 1-byte STCs:

stcNext   When the user breaks a paragraph having the current style by
          inserting a paragraph mark, the newly created paragraph will be
          given the default properties of style stcNext (must be a defined
          style); the default stcNext for an stc is itself
stcBase   the style that this style is based on (must be a defined style)

The st's in the grpst's above can have special meanings. In sttbName, if an st is 255 the style is not defined (in which case the CHPX and PAPX should also be 255). In the sttbName, if the length byte of the st is 0, it is a standard style which has the internal built-in name (no alternates added). If there are any alternate names for a style, they are appended together and separated by commas. The internal names for standard styles may not be changed (though alternates may be appended) or used as alternates for any other style. They are eternally wed to the standard style.

If the length byte of the st is 0 in the sttbChpx or the sttbPapx, there are no differences from the properties of the style it is based on. If the length byte of the name is not 255, but the length byte of the CHPX or PAPX is 255 (if either one is, the other one should also be), then it is a standard style whose character or paragraph properties (respectively) have not changed from the internal built-in styles (described below). For user-defined styles, the name must contain at least one non-space character.

STCs 222 through 255 and 0 are reserved for "standard" styles, e.g. styles for headers, footers, page numbers, etc. (this leaves 1-221 for user-defined styles). The "normal" style is the style used as a default for all paragraphs in a document that don't have a style.

The default (built-in) property exceptions (differences from "based-on" style) for standard styles:

  0  Normal                standard PAP (standard PAP has all fields cleared to 0),
                           standard CHP (chp.hps = 20, all other fields set to 0)
255  Normal indent         pap.dxaLeft = 720

/* Heading levels */
254  heading 1             pap.dyaBefore = 240 (12 points),
                           chp.fBold = negation of Normal style's chp.fBold,
                           chp.kul = 1 (single underline), chp.hps = 24, chp.ftc = 2
253  heading 2             pap.dyaBefore = 120 (6 points),
                           chp.fBold = negation of Normal style's chp.fBold,
                           chp.hps = 24, chp.ftc = 2
252  heading 3             pap.dxaLeft = 360,
                           chp.fBold = negation of Normal style's chp.fBold,
                           chp.hps = 24
251  heading 4             pap.dxaLeft = 360, chp.kul = 1 (single underline),
                           chp.hps = 24
250  heading 5             pap.dxaLeft = 720,
                           chp.fBold = negation of Normal style's chp.fBold,
                           chp.hps = 20
249  heading 6             pap.dxaLeft = 720, chp.kul = 1 (single underline),
                           chp.hps = 20
248  heading 7             pap.dxaLeft = 720,
                           chp.fItalic = negation of Normal style's chp.fItalic,
                           chp.hps = 20
247  heading 8             pap.dxaLeft = 720,
                           chp.fItalic = negation of Normal style's chp.fItalic,
                           chp.hps = 20
246  heading 9             pap.dxaLeft = 720,
                           chp.fItalic = negation of Normal style's chp.fItalic,
                           chp.hps = 20

245  footnote text         chp.hps = 20
244  footnote reference    chp.hps = 16; hpsPos = 6
243  header                When running a U.S. system file:
                             pap.itbdMac = 2,
                             pap.rgdxaTab[0] = 3 * 1440, pap.rgtbd[0].jc = 1, pap.rgtbd[0].tlc = 0,
                             pap.rgdxaTab[1] = 6 * 1440, pap.rgtbd[1].jc = 1, pap.rgtbd[1].tlc = 0
                           When running an International metric system:
                             pap.itbdMac = 2,
                             pap.rgdxaTab[0] = 3969, pap.rgtbd[0].jc = 1, pap.rgtbd[0].tlc = 0,
                             pap.rgdxaTab[1] = 8504, pap.rgtbd[1].jc = 1, pap.rgtbd[1].tlc = 0
242  footer                When running a U.S. system file:
                             pap.itbdMac = 2,
                             pap.rgdxaTab[0] = 3 * 1440, pap.rgtbd[0].jc = 1, pap.rgtbd[0].tlc = 0,
                             pap.rgdxaTab[1] = 6 * 1440, pap.rgtbd[1].jc = 1, pap.rgtbd[1].tlc = 0
                           When running an International metric system:
                             pap.itbdMac = 2,
                             pap.rgdxaTab[0] = 3969, pap.rgtbd[0].jc = 1, pap.rgtbd[0].tlc = 0,
                             pap.rgdxaTab[1] = 8504, pap.rgtbd[1].jc = 1, pap.rgtbd[1].tlc = 0
241  index heading         same as properties for Normal style (stc == 0)
240  line number           same as properties for Normal style (stc == 0)

/* Index entries: */
When running on U.S. system file all have
  pap.dxaLeft = (index level number - 1) * 360
When running on an International metric system all have
  pap.dxaLeft = (index level number - 1) * 283
239  index 1
238  index 2
237  index 3
233  index 7

/* Table of Contents entries: */
When running on U.S. system file:
  pap.itbdMac = 2,
  pap.rgdxaTab[0] = 8280, pap.rgtbd[0].jc = 0, pap.rgtbd[0].tlc = 1,
  pap.rgdxaTab[1] = 8640, pap.rgtbd[1].jc = 2, pap.rgtbd[1].tlc = 0;
  pap.dxaRight = 720
  pap.dxaLeft = (table of contents level number - 1) * 720
When running an International metric system:
  pap.itbdMac = 1,
  pap.rgdxaTab[0] = 8280, pap.rgtbd[0].jc = 2, pap.rgtbd[0].tlc = 1,
  pap.dxaRight = 850
  pap.dxaLeft = (table of contents level number - 1) * 2835
232  toc 1
231  toc 2
230  toc 3
229  toc 4
228  toc 5
227  toc 6
226  toc 7
225  toc 8

224  annotation text       chp.hps = 20
223  annotation reference  chp.hps = 16
223  /* reserved */
222  Null stc (no name)    all pap fields = 0,
                           standard character props (chp.ftc = 2, chp.hps = 24)

Even if a document has no style sheet, the minimal STSH that must be written to the file takes up 17 bytes and is:

cstcStd = 0
/* one entry in each of the sttbs: */
sttbName:  cb = 3 (includes itself and the "\0")
           0 (empty string)
sttbChpx:  cb = 8
           CHPX: cb = 5
                 /* all zeros except: */
                 fsHps = True
                 hps = 20 (decimal)
sttbPapx:  cb = 6
           PAPX: cb = 3
                 stc = 0
                 paph = 0
/* one entry in the plfestcp: */
iMac = 1

SPRM Definitions
----------------

A sprm is an instruction to modify one or more properties within one of the property defining data structures (CHP, PAP, TAP, SEP, or PIC). A sprm always begins with a one-byte opcode at offset 0 which identifies the operation to be performed.

If the information necessary for the operation can always be expressed with a fixed length parameter, the fixed length parameter is recorded immediately after the opcode, beginning at offset 1. The length of a fixed length sprm is always 1 plus the size of the sprm's parameter.

If the parameter for the sprm is variable length, the count of bytes of the following parameter is stored in the byte at offset 1. For variable length sprms, the total length of the sprm is the count recorded at offset 1 plus two. The parameter immediately follows the count.

Two sprms, sprmPChgTabs and sprmTDefTable, can be longer than 256 bytes. The method for calculating the length of sprmPChgTabs is recorded below with the description of the sprm. For sprmTDefTable, the length of the parameter plus 1 is recorded in the two bytes beginning at offset 1.

Unless otherwise noted, when a sprm is applied to a property the sprm's parameter changes the old value of the property in question to the value stored in the sprm parameter.
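The two general length rules above can be sketched as a routine that walks a group of sprms. This is a minimal illustration, not a complete decoder: the function names and the small FIXED_PARAM_SIZE dictionary are hypothetical, the dictionary holds only three fixed-length opcodes taken from the opcode table, and the two oversized cases (sprmPChgTabs with a count of 255, and sprmTDefTable) are deliberately left out.

```python
# Hypothetical partial table: opcode -> size in bytes of the fixed parameter.
FIXED_PARAM_SIZE = {
    5: 1,    # sprmPJc, byte parameter
    17: 2,   # sprmPDxaLeft, word parameter
    60: 1,   # sprmCFBold, byte parameter
}


def sprm_length(grpprl: bytes, offset: int, fixed_param_size: dict) -> int:
    """Total length in bytes of the sprm that starts at offset."""
    opcode = grpprl[offset]
    if opcode in fixed_param_size:
        # Fixed length: one opcode byte plus the parameter.
        return 1 + fixed_param_size[opcode]
    # Variable length: the count at offset 1, plus the opcode and count bytes.
    # Assumption: every opcode missing from the table is variable length.
    return grpprl[offset + 1] + 2


def split_grpprl(grpprl: bytes, fixed_param_size: dict) -> list:
    """Cut a group of sprms into the raw bytes of each individual sprm."""
    sprms = []
    offset = 0
    while offset < len(grpprl):
        n = sprm_length(grpprl, offset, fixed_param_size)
        sprms.append(grpprl[offset:offset + n])
        offset += n
    return sprms


# sprmPJc = 1, sprmPDxaLeft = 720 (0x02D0), then a variable length
# sprm (opcode 15) whose count byte is 2.
example = bytes([5, 1, 17, 0xD0, 0x02, 15, 2, 0, 0])
print([s.hex() for s in split_grpprl(example, FIXED_PARAM_SIZE)])
```

A real reader would need the full opcode table and the special-case length rules before it could skip safely over every sprm.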
Name                  op code  Property Modified             Parameter                       Parameter size
sprmPStc              2        pap.stc                       stc (style code)                byte
sprmPStcPermute       3        pap.stc                       permutation vector (see below)  variable length
sprmPIncLevel         4        pap.stc                       difference between stc of base  byte
                                                             PAP and stc of PAP to be
                                                             produced (see below)
sprmPJc               5        pap.jc                        jc (justification)              byte
sprmPFSideBySide      6        pap.fSideBySide               0 or 1                          byte
sprmPFKeep            7        pap.fKeep                     0 or 1                          byte
sprmPFKeepFollow      8        pap.fKeepFollow               0 or 1                          byte
sprmPPageBreakBefore  9        pap.fPageBreakBefore          0 or 1                          byte
sprmPBrcl             10       pap.brcl                      brcl                            byte
sprmPBrcp             11       pap.brcp                      brcp                            byte
sprmPNfcSeqNumb       12       pap.nfcSeqNumb                nfc                             byte
sprmPNoSeqNumb        13       pap.nnSeqNumb                 nn                              byte
sprmPFNoLineNumb      14       pap.fNoLnn                    0 or 1                          byte
sprmPChgTabsPapx      15       pap.itbdMac, pap.rgdxaTab,    complex - see below             variable length
                               pap.rgtbd
sprmPDxaRight         16       pap.dxaRight                  dxa                             word
sprmPDxaLeft          17       pap.dxaLeft                   dxa                             word
sprmPNest             18       pap.dxaLeft                   dxa - see below                 word
sprmPDxaLeft1         19       pap.dxaLeft1                  dxa                             word
sprmPDyaLine          20       pap.dyaLine                   dya                             word
sprmPDyaBefore        21       pap.dyaBefore                 dya                             word
sprmPDyaAfter         22       pap.dyaAfter                  dya                             word
sprmPChgTabs          23       pap.itbdMac, pap.rgdxaTab,    complex - see below             variable length
                               pap.rgtbd
sprmPFInTable         24       pap.fInTable                  0 or 1                          byte
sprmPDyaAbs           27       pap.dyaAbs                    dya                             word
sprmPDxaWidth         28       pap.dxaWidth                  dxa                             word
sprmPPc               29       pap.pcHorz, pap.pcVert        complex - see below             byte
sprmPBrcTop10         30       pap.brcTop                    BRC10                           word
sprmPBrcLeft10        31       pap.brcLeft                   BRC10                           word
sprmPBrcBottom10      32       pap.brcBottom                 BRC10                           word
sprmPBrcRight10       33       pap.brcRight                  BRC10                           word
sprmPBrcBetween10     34       pap.brcBetween                BRC10                           word
sprmPBrcBar10         35       pap.brcBar                    BRC10                           word
sprmPFromText10       36       pap.dxaFromText               dxa                             word
sprmPBrcTop           38       pap.brcTop                    BRC                             word
sprmPBrcLeft          39       pap.brcLeft                   BRC                             word
sprmPBrcBottom        40       pap.brcBottom                 BRC                             word
sprmPBrcRight         41       pap.brcRight                  BRC                             word
sprmPBrcBetween       42       pap.brcBetween                BRC                             word
sprmPBrcBar           43       pap.brcBar                    BRC                             word
sprmPWHeightAbs       45       pap.wHeightAbs                w                               word
sprmPShd              47       pap.shd                       SHD                             word
sprmPDyaFromText      48       pap.dyaFromText               dya                             word
sprmPDxaFromText      49       pap.dxaFromText               dxa                             word
sprmPFBiDi            50       pap.fBiDi
sprmCFStrikeRM        53       chp.fRMarkDel                 1 or 0                          bit
sprmCFRMark           54       chp.fRMark                    1 or 0                          bit
sprmCFFldVanish       55       chp.fFldVanish                1 or 0                          bit
sprmCDefault          57       whole CHP (see below)         none                            variable length
sprmCPlain            58       whole CHP (see below)         none                            0
sprmCFBold            60       chp.fBold                     0, 1, 128, or 129 (see below)   byte
sprmCFItalic          61       chp.fItalic                   0, 1, 128, or 129 (see below)   byte
sprmCFStrike          62       chp.fStrike                   0, 1, 128, or 129 (see below)   byte
sprmCFOutline         63       chp.fOutline                  0, 1, 128, or 129 (see below)   byte
sprmCFShadow          64       chp.fShadow                   0, 1, 128, or 129 (see below)   byte
sprmCFSmallCaps       65       chp.fSmallCaps                0, 1, 128, or 129 (see below)   byte
sprmCFCaps            66       chp.fCaps                     0, 1, 128, or 129 (see below)   byte
sprmCFVanish          67       chp.fVanish                   0, 1, 128, or 129 (see below)   byte
sprmCFtc              68       chp.ftc                       ftc                             word
sprmCKul              69       chp.kul                       kul                             byte
sprmCSizePos          70       chp.hps, chp.hpsPos           (see below)                     3 bytes
sprmCQpsSpace         71       chp.qpsSpace                  qps                             word
sprmCLid              72       chp.lid                       LID                             word
sprmCIco              73       chp.ico                       ico                             byte
sprmCHpsPos           76       chp.hpsPos                    hps                             byte
sprmCHpsPosAdj        77       chp.hpsPos                    hps (see below)                 byte
sprmCMajority         78       whole CHP                     complex (see below)             length byte plus 8 bytes
sprmCFBoldBi          80       chp.fBoldBi                   0, 1, 128, or 129 (see below)   byte
sprmCFItalicBi        81       chp.fItalicBi                 0, 1, 128, or 129 (see below)   byte
sprmCFtcBi            82       chp.ftcBi                     ftcBi                           word
sprmClidBi            83       chp.lidBi                     LID                             word
sprmCIcoBi            84       chp.icoBi                     ico                             byte
sprmCHpsBi            85       chp.hpsBi                     hps                             byte
sprmCFBiDi            86       chp.fBiDi                     0, 1, 128, or 129 (see below)   byte
sprmCFDiacColor       87       chp.fDiacUSico                0, 1, 128, or 129 (see below)   byte
sprmPicBrcl           94       pic.brcl                      brcl (see PIC structure         byte
                                                             definition)
sprmPicScale          95       pic.mx, pic.my,               complex (see below)             length byte plus 12 bytes
                               pic.dxaCropleft,
                               pic.dyaCropTop,
                               pic.dxaCropRight,
                               pic.dyaCropBottom
sprmPicBrcTop         96       pic.brcTop                    BRC                             word
sprmPicBrcLeft        97       pic.brcLeft                   BRC                             word
sprmPicBrcBottom      98       pic.brcBottom                 BRC                             word
sprmPicBrcRight       99       pic.brcRight                  BRC                             word
sprmSFRTLGutter       112      sep.fRTLGutter                0, 1, 128, or 129 (see below)   byte
sprmSFBiDi            114      sep.fBiDi                     0, 1, 128, or 129 (see below)   byte
sprmSDmBinFirst       115      sep.dmBinFirst                                                word
sprmSDmBinOther       116      sep.dmBinOther                                                word
sprmSBkc              117      sep.bkc                       bkc                             byte
sprmSFTitlePage       118      sep.fTitlePage                0 or 1                          byte
sprmSCcolumns         119      sep.ccolM1                    # of cols - 1                   word
sprmSDxaColumns       120      sep.dxaColumns                dxa                             word
sprmSFAutoPgn         121      sep.fAutoPgn                  obsolete                        byte
sprmSNfcPgn           122      sep.nfcPgn                    nfc                             byte
sprmSDyaPgn           123      sep.yaPage                    ya                              word
sprmSDxaPgn           124      sep.xaPage                    xa                              word
sprmSFPgnRestart      125      sep.fPgnRestart               0 or 1                          byte
sprmSFEndnote         126      sep.fEndnote                  0 or 1                          byte
sprmSLnc              127      sep.lnc                       lnc                             byte
sprmSGprfIhdt         128      sep.grpfIhdt                  grpfihdt (see Headers and       byte
                                                             Footers topic)
sprmSNLnnMod          129      sep.nLnnMod                   non-neg int.                    word
sprmSDxaLnn           130      sep.dxaLnn                    dxa                             word
sprmSVjc              134      sep.vjc                       vjc                             byte
sprmSLnnMin           135      sep.lnnMin                    lnn                             word
sprmSPgnStart         136      sep.pgnStart                  pgn                             word
sprmSBOrientation     137      sep.morPage                   mor (CHAR)                      byte
sprmSFFacingCol       138      sep.fFacingCol                0, 1, 128, or 129 (see below)   byte
sprmSXaPage           139      sep.xaPage                    xa                              word
sprmSYaPage           140      sep.yaPage                    ya                              word
sprmSDxaLeft          141      sep.dxaLeft                   dxa                             word
sprmSDxaRight         142      sep.dxaRight                  dxa                             word
sprmSDyaTop           143      sep.dyaTop                    dya                             word
sprmSDyaBottom        144      sep.dyaBottom                 dya                             word
sprmSDzaGutter        145      sep.dzaGutter                 dza                             word
sprmTJc               146      tap.jc                        jc                              word (low order byte is significant)
sprmTDxaLeft          147      tap.rgdxaCenter (see below)   dxa                             word
sprmTDxaGapHalf       148      tap.dxaGapHalf,               dxa                             word
                               tap.rgdxaCenter (see below)
sprmTFBiDi            149      tap.fBiDi                     0, 1, 128, or 129 (see below)   byte
sprmTDefTable10       152      tap.rgdxaCenter, tap.rgtc     complex (see below)             variable length
sprmTDyaRowHeight     153      tap.dyaRowHeight              dya                             word
sprmTDefTable         154      tap.rgdxaCenter, tap.rgtc     complex (see below)             0
sprmTDefTableShd      155      tap.rgshd                     complex (see below)             0
sprmTSetBrc           157      tap.rgtc[].rgbrc              complex (see below)             5 bytes
sprmTInsert           158      tap.rgdxaCenter, tap.rgtc     complex (see below)             4 bytes
sprmTDelete           159      tap.rgdxaCenter, tap.rgtc     complex (see below)             word
sprmTDxaCol           160      tap.rgdxaCenter               complex (see below)             4 bytes
sprmTMerge            161      tap.fFirstMerged, tap.fMerged complex (see below)             word
sprmTSplit            162      tap.fFirstMerged, tap.fMerged complex (see below)             word
sprmTSetBrc10         163      tap.rgtc[].rgbrc              complex (see below)             5 bytes
sprmTSetShd           164      tap.rgshd                     complex (see below)             4 bytes
sprmMax               165

The paragraph sprms used to encode paragraph properties in a PAPX are: sprmPJc, sprmPFSideBySide, sprmPFKeep, sprmPFKeepFollow, sprmPFPageBreakBefore, sprmPBrcp, sprmPPc, sprmPBrcl, sprmPFNoLineNumb, sprmPDxaRight, sprmPDxaLeft, sprmPDxaLeft1, sprmPDyaLine, sprmPDyaBefore, sprmPDyaAfter, sprmPFInTable, sprmPFTtp, sprmPDxaAbs, sprmPDyaAbs, sprmPDxaWidth, sprmPBrcTop, sprmPBrcLeft,

The table sprms used to encode table properties in a PAPX stored in a PAPX FKP are: sprmTJc, sprmTDxaGapHalf, sprmTDyaRowHeight, sprmTDefTableShd, and sprmTDefTable.

The section sprms used to encode section properties in a SEPX are: sprmSBkc, sprmSFTitlePage, sprmSCcolumns, sprmSNfcPgn, sprmSPgnStart, sprmSFAutoPgn, sprmSDyaPgn, sprmSDxaPgn, sprmSFPgnRestart, sprmSFEndnote, sprmSLnc, sprmSGrpfIhdt, sprmSNLnnMod, sprmSDxaLnn, sprmSDyaHdrTop, sprmSDyaHdrBottom.

sprmPStcPermute (opcode 3) is a complex sprm which is applied to a piece when the style codes of paragraphs within a piece must be mapped to other style codes. It has the following format:

Field            Size   Comment
sprm             byte   opcode (== 3)
cch              byte   count of bytes (not including sprm and cch)
mpstcFromstcTo   byte   permutation mapping from original stc values to new stc values

To interpret sprmPStcPermute, first check whether pap.stc is greater than 0 and less than or equal to the cch stored in the sprm. If not, the sprm has no effect. If it is, pap.stc is set to mpstcFromstcTo[pap.stc - 1]. sprmPStcPermute is only stored in grpprls linked to a piece table.

sprmPIncLvl (opcode 4) is applied to pieces in the piece table that contain paragraphs with style codes greater than or equal to 247 and less than or equal to 255. These style codes identify heading levels in a Word outline structure.
The sprm causes a set of paragraphs to be changed to a new heading level. The sprm is two bytes long and consists of the sprm code and a one-byte two’s complement value. If pap.stc is less than 247, sprmPIncLvl has no effect. Otherwise, if the value stored in the byte has its highest order bit off, the value is a positive difference which should be subtracted from pap.stc, and then pap.stc should be set to max(pap.stc, 247). If the byte value has its highest order bit on, the value is a negative difference which should be sign extended to a word and then subtracted from pap.stc. Then pap.stc should be set to min(255, pap.stc). sprmPIncLvl is only stored in grpprls linked to a piece table.

The sprmPChgTabsPapx (opcode 15) is a complex sprm that describes changes in tab settings from the underlying style. It is only stored as part of PAPXs stored in FKPs and in the STSH. It has the following format:

Field        Size               Comment
sprm         byte               opcode
cch          byte               count of bytes (not including sprm and cch)
itbdDelMax   byte               number of tabs to delete
rgdxaDel     int[itbdDelMax]    array of tab positions for which tabs should be deleted
itbdAddMax   byte               number of tabs to add
rgdxaAdd     int[itbdAddMax]    array of tab positions for which tabs should be added
rgtbdAdd     byte[itbdAddMax]   array of tab descriptors corresponding to rgdxaAdd

When sprmPChgTabsPapx is interpreted, the rgdxaDel of the sprm is applied first to the pap that is being transformed. This is done by deleting from the pap the rgdxaTab entry and rgtbd entry of any tab whose rgdxaTab value is equal to one of the rgdxaDel values in the sprm. It is guaranteed that the entries in pap.rgdxaTab and the sprm's rgdxaDel and rgdxaAdd are recorded in ascending dxa order. Then the rgdxaAdd and rgtbdAdd entries are merged into the pap’s rgdxaTab and rgtbd arrays so that the resulting pap rgdxaTab is sorted in ascending order with no duplicates.

sprmPNest (opcode 18) causes its operand, a two-byte dxa value, to be added to pap.dxaLeft. If the result of the addition

The sprmPChgTabs (opcode 23) is a complex sprm which describes changes in tab settings for any paragraph within a piece. It is only stored as part of a grpprl linked to a piece table. It has the following format:

Field        Size               Comment
sprm         byte               opcode
cch          byte               count of bytes (not including sprm and cch)
itbdDelMax   byte               number of tabs to delete
rgdxaDel     int[itbdDelMax]    array of tab positions for which tabs should be deleted
rgdxaClose   int[itbdDelMax]    array of tolerances corresponding to rgdxaDel, where each
                                tolerance defines an interval around the corresponding
                                rgdxaDel entry within which all tabs should be removed
itbdAddMax   byte               number of tabs to add
rgdxaAdd     int[itbdAddMax]    array of tab positions for which tabs should be added
rgtbdAdd     byte[itbdAddMax]   array of tab descriptors corresponding to rgdxaAdd

itbdDelMax and itbdAddMax are defined to be equal to 50. This means that the largest possible instance of sprmPChgTabs is 354. When the length of the sprm is greater than or equal to 255, the cch field will be set equal to 255. When cch == 255, the actual length of the sprm can be calculated as follows:

    length = 2 + itbdDelMax * 4 + itbdAddMax * 3

When sprmPChgTabs is interpreted, the rgdxaDel of the sprm is applied first to the pap that is being transformed. This is done by deleting from the pap the rgdxaTab entry and rgtbd entry of any tab whose rgdxaTab value is within the interval [rgdxaDel[i] - rgdxaClose[i], rgdxaDel[i] + rgdxaClose[i]]. It is guaranteed that the entries in pap.rgdxaTab and the sprm's rgdxaDel and rgdxaAdd are recorded in ascending dxa order. Then the rgdxaAdd and rgtbdAdd entries are merged into the pap’s rgdxaTab and rgtbd arrays so that the resulting pap rgdxaTab is sorted in ascending order with no duplicates.

The sprmPPc (opcode 29) is a complex sprm which describes changes in the pap.pcHorz and pap.pcVert. It is able to change both fields’ contents in parallel. It has the following format:

Dec  Hex  field   type  size  bitfield  comments
0    0    sprm    byte                  opcode
1    1            int   :4    F0        reserved
          pcVert  int   :2    0C        if pcVert == 3, pap.pcVert should not be changed.
                                        Otherwise, contains new value of pap.pcVert.
          pcHorz  int   :2    03        if pcHorz == 3, pap.pcHorz should not be changed.
                                        Otherwise, contains new value of pap.pcHorz.

Length of sprmPPc is two bytes. sprmPPc is interpreted by moving pcVert to pap.pcVert if pcVert != 3 and by moving pcHorz to pap.pcHorz if pcHorz != 3. sprmPPc is stored in PAPX FKPs and also in grpprls linked to piece table entries.

sprmCDefault (opcode 57) clears the fBold, fItalic, fOutline, fStrike, fShadow, fSmallCaps, fCaps, fVanish, kul and ico fields of the chp to 0. It was first defined for Word 3.01 and had to be backward compatible with Word 3.00, so it is a variable length sprm whose count of bytes is 0. It consists of the sprmCDefault opcode followed by a byte of 0. sprmCDefault is stored only in grpprls linked to piece table entries.

sprmCPlain (opcode 58) is used to make the character properties of runs of text equal to the style character properties of the paragraph that contains the text. When Word interprets this sprm, the style sheet CHP is copied over the original CHP, preserving the fSpec setting from the original CHP. sprmCPlain is stored only in grpprls linked to piece table entries.

sprms 60 through 67 (sprmCFBold through sprmCFVanish) set single bit properties in the CHP. When the parameter of the sprm is set to 0 or 1, the CHP property is set to the parameter value. When the parameter of the sprm is 128, the CHP property is set to the value that is stored for the property in the style sheet CHP. When the parameter of the sprm is 129, the CHP property is set to the negation of the value that is stored for the property in the style sheet CHP. sprmCFBold through sprmCFVanish are stored only in grpprls linked to piece table entries.
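The four parameter values accepted by sprms 60 through 67 can be sketched as one small function. The function name and the idea of passing the style sheet value as a plain integer are assumptions made for illustration; the mapping itself follows the rules stated above.

```python
def apply_toggle_sprm(param: int, style_value: int) -> int:
    """Return the new value of a single-bit CHP property.

    param is the one-byte sprm parameter; style_value is the value (0 or 1)
    stored for the same property in the style sheet CHP.
    """
    if param in (0, 1):
        return param                # set the property directly
    if param == 128:
        return style_value          # take the style sheet value
    if param == 129:
        return 1 - style_value      # negation of the style sheet value
    raise ValueError("unexpected toggle parameter: %d" % param)


# Example: a style that is bold (1), seen through each parameter value.
for p in (0, 1, 128, 129):
    print(p, "->", apply_toggle_sprm(p, 1))
```

Rejecting other parameter values is a choice made in this sketch; the format description does not say how a reader should treat them.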
sprmCSizePos (opcode 70) is a four-byte sprm consisting of the sprm opcode and a three-byte parameter. The sprm has the following format:

Dec  Hex  field    type  size  bitfield  comments
0    0    sprm     byte                  opcode
1    1    hpsSize  int   :8    FF        when != 0, contains new size of chp.hps
2    2    cInc     int   :7    FE        contains the number of font levels to increase or
                                         decrease size of chp.hps as a twos complement value
          fAdjust  int   :1    01        when == 1, means that chp.hps should be adjusted
                                         up/down by one font level for super/subscripting change
3    3    hpsPos   int   :8    FF        when != 128, contains super/subscript position as a
                                         twos complement number

When Word interprets this sprm, if hpsSize != 0 then chp.hps is set to hpsSize. If cInc != 0, cInc is interpreted as a 7-bit twos complement number and the procedure described below for interpreting sprmCHpsInc is followed to increase or decrease chp.hps by the specified number of levels. If hpsPos != 128, then chp.hpsPos is set equal to hpsPos. If fAdjust is on, hpsPos != 128, hpsPos != 0 and the previous value of chp.hpsPos == 0, then chp.hps is reduced by one level following the method described for sprmCHpsInc. If fAdjust is on, hpsPos == 0 and the previous value of chp.hpsPos != 0, then the chp.hps value is increased by one level using the method described below for sprmCHpsInc.

sprmCHpsInc (opcode 75) is a two-byte sprm consisting of the sprm opcode and a one-byte parameter. Word keeps an ordered array of the font sizes that are defined for the fonts recorded in the system file, with each font size transformed into an hps. The parameter is a one-byte twos complement number. Word uses this number to calculate an index in the font size array to determine the new hps for a run. When Word interprets this sprm and the parameter is positive, it searches the array of font sizes to find the index of the smallest entry in the font size table that is greater than the current chp.hps. It then adds the parameter minus 1 to the index and takes the minimum of this and the index of the last array entry. It uses the result as an index into the font size array and assigns that entry of the array to chp.hps. When the parameter is negative, Word searches the array of font sizes to find the index of the entry that is less than or equal to the current chp.hps. It then adds the negative parameter to the index and takes the maximum of the result and 0. The result of the max function is used as an index into the font size array and that entry of the array is assigned to chp.hps. sprmCHpsInc is stored only in grpprls linked to piece table entries.

sprmCHpsPosAdj (opcode 77) causes the hps of a run to be reduced the first time text is superscripted or subscripted and causes the hps of a run to be increased when superscripting/subscripting is removed from a run. The one-byte parameter of this sprm is the new hpsPos value that is to be stored in chp.hpsPos. If the new hpsPos is not equal to 0 (meaning that the text is to be super/subscripted), Word first examines the current value of chp.hpsPos to see if it is equal to 0. If so, Word uses the algorithm described for sprmCHpsInc to decrease chp.hps by one level. If the new hpsPos == 0 (meaning the text is not super/subscripted), Word examines the current chp.hpsPos to see if it is not equal to 0. If it is not (which means text is being restored to normal position), Word uses the sprmCHpsInc algorithm to increase chp.hps by one level. After chp.hps is adjusted, the parameter value is stored in chp.hpsPos. sprmCHpsPosAdj is stored only in grpprls linked to piece table entries.

The parameter of sprmCMajority (opcode 78) is the first 8 bytes of a CHP which encodes a criterion under which certain value as the field stored in the sprm, then that field is reset to the value stored in the style’s CHP. If the two copies differ, then the original CHP value is left unchanged. sprmCMajority is stored only in grpprls linked to piece table entries.
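The level-stepping procedure described for sprmCHpsInc, which sprmCSizePos and sprmCHpsPosAdj also rely on, can be sketched as follows. This is an interpretation rather than Word's actual code: the function name and the sizes list are hypothetical, the computed index is assumed to be clamped to the bounds of the font size array, and the behavior when no array entry qualifies (or when the parameter is 0) is a guess that the description does not cover.

```python
def apply_hps_inc(hps: int, param: int, sizes: list) -> int:
    """Step hps up or down by param levels within an ascending list of sizes."""
    last = len(sizes) - 1
    if param > 0:
        # Index of the smallest entry greater than the current hps.
        # Assumption: if there is none, start past the end so the clamp
        # below selects the last entry.
        idx = next((i for i, s in enumerate(sizes) if s > hps), len(sizes))
        idx = min(idx + param - 1, last)
    elif param < 0:
        # Index of the largest entry less than or equal to the current hps.
        # Assumption: if there is none, fall back to the first entry.
        idx = max((i for i, s in enumerate(sizes) if s <= hps), default=0)
        idx = max(idx + param, 0)
    else:
        return hps  # assumption: a zero parameter leaves hps unchanged
    return sizes[idx]


sizes = [16, 20, 24, 28]  # hypothetical font size array, in half points
print(apply_hps_inc(20, 1, sizes))   # one level up from 20
print(apply_hps_inc(20, -1, sizes))  # one level down from 20
print(apply_hps_inc(20, 5, sizes))   # more levels than exist: last entry
```

Under these assumptions, sprmCHpsPosAdj would call the same routine with a parameter of -1 when text first becomes super/subscripted and +1 when it returns to the normal position.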
sprmPicScale (opcode 95) is used to scale the x and y dimensions of a Word picture and to set the cropping for each side of the picture. The sprm begins with the one byte opcode, followed by the length of the parameter (always 12) stored in a byte. The 12-byte long operand consists of an array of 6 two-byte integer fields. The 0th integer contains the new setting for pic.mx. The 1st integer contains the new setting for pic.my. The 2nd integer contains the new setting for pic.dxaCropLeft. The 3rd integer contains the new setting for pic.dyaCropTop. The 4th integer contains the new setting for pic.dxaCropRight. The 5th integer contains the new setting of pic.dxaCropBottom. sprmPicScale is stored only in grpprls linked to piece table entries. sprmTDxaLeft (opcode 147) is called to adjust the x position within a column which marks the left boundary of text within the first cell of a table row. This sprm causes a whole table row to be shifted left or right within its column leaving the horizontal width and vertical height of cells in the row unchanged. Byte 0 of the sprm contains the opcode, and the new dxa position, call it dxaNew, is stored as an integer in bytes 1 and 2. Word interprets this sprm by adding dxaNew - (rgdxaCenter[0] + tap.dxaGapHalf) to every entry of tap.rgdxaCenter whose index is less than tap.itcMac. sprmTDxaLeft is stored only in grpprls linked to piece table entries. sprmTDxaGapHalf (opcode 148) adjusts the white space that is maintained between columns by changing tap.dxaGapHalf. Because we want the left boundary of text within the leftmost cell to be at the same location after the sprm is applied, Word also adjusts tap.rgdxCenter[0] by the amount that tap.dxaGapHalf changes. Byte 0 of the sprm contains the opcode, and the new dxaGapHalf, call it dxaGapHalfNew, is stored in bytes 1 and 2. 
When the sprm is interpreted, the change between the old and new dxaGapHalf values, tap.dxaGapHalf - dxaGapHalfNew, is added to tap.rgdxaCenter[0] and then dxaGapHalfNew is moved to tap.dxaGapHalf. sprmTDxaGapHalf is stored in PAPXs and also in grpprls linked to piece table entries. sprmTDefTable10 (opcode 152) is an obsolete version of sprmTDefTable (opcode 154) that was used in Word for Windows 1.x. Its contents are identical to those in sprmTDefTable, except that the TC structures contain the obsolete structures BRC10s. sprmTDefTable (opcode 154) defines the boundaries of table cells (tap.rgdxaCenter) and the properties of each cell in a table (tap.rgtc). The 0th byte of the sprm contains its opcode. Bytes 1 and 2 store a two-byte length of the following parameter. Byte 3 contains the number of cells that are to be defined by the sprm, call it itcMac. When the sprm is interpreted, itcMac is moved to tap.itcMac. itcMac cannot be larger than 32. In bytes 4 through 4+2*(itcMac + 1) -1 , is stored an array of integer dxa values sorted in ascending order which will be moved to tap.rgdxaCenter. In bytes 4+ 2*(itcMac + 1) through byte 4+2*(itcMac + 1) + 10*itcMac - 1 is stored an array of TC entries corresponding to the stored tap.rgdxaCenter. This array is moved to tap.rgtc. sprmTDefTable is only stored in PAPXs. sprmTDefTableShd (opcode 155) is similar to sprmTDefTable, and compliments it by defining the shading of each cell in a table (tap.rgshd). The 0th byte of the sprm contains its opcode. Bytes 1 and 2 store a two-byte length of the following parameter. Byte 3 contains the number of cells that are to be defined by the sprm, call it itcMac. itcMac cannot be larger than 32. In bytes 4 through 4+2*(itcMac + 1) -1 , is stored an array of SHDs. This array is moved to tap.rgshd. sprmTDefTable is only stored in PAPXs. sprmTInsert (opcode 158) inserts new cell definitions in an existing table’s cell structure. 
The 0th byte of the sprm contains the opcode. Byte 1 is the index within tap.rgdxaCenter and tap.rgtc at which the new dxaCenter and tc values will be inserted. Call this index itcInsert. Byte 2 contains a count of the cell definitions to be added to the tap, call it ctc. Bytes 3 and 4 contain the width of the cells that will be added, call it dxaCol. If there are already cells defined at the index where cells are to be inserted, tap.rgdxaCenter entries at or above this index must be moved to the entry ctc higher and must be adjusted by adding ctc * dxaCol to the value stored. The contents of tap.rgtc at or above the index must be moved 10 * ctc bytes higher in tap.rgtc. If itcInsert is greater than the original tap.itcMac, itcInsert - tap.ctc columns of width dxaCol must be added beginning with index tap.itcMac (loop from itcMac to itcMac + itcInsert - tap.ctc adding [...] were added to the tap is added to tap.itcMac. sprmTInsert is stored only in grpprls linked to piece table entries.

sprmTDelete (opcode 159) deletes cell definitions from an existing table’s cell structure. The 0th byte of the sprm contains the opcode. Byte 1 contains the index of the first cell to delete, call it itcFirst. Byte 2 contains the index of the cell that follows the last cell to be deleted, call it itcLim. sprmTDelete causes any rgdxaCenter and rgtc entries whose index is greater than or equal to itcLim to be moved to the entry that is itcLim - itcFirst lower, and causes tap.itcMac to be decreased by the number of cells deleted. sprmTDelete is stored only in grpprls linked to piece table entries.

sprmTDxaCol (opcode 160) sets the width of every cell whose index lies within a given range to a given value. The 0th byte of the sprm contains the opcode. Byte 1 contains the index of the first cell whose width is to be changed, call it itcFirst. Byte 2 contains the index of the cell that follows the last cell whose width is to be changed, call it itcLim.
Bytes 3 and 4 contain the new width of the cell, call it dxaCol. This sprm causes the itcLim - itcFirst entries of tap.rgdxaCenter to be adjusted so that tap.rgdxaCenter[i+1] = tap.rgdxaCenter[i] + dxaCol. Any tap.rgdxaCenter entries that exist beyond itcLim are adjusted to take into account the amount added to or removed from the previous columns. sprmTDxaCol is stored only in grpprls linked to piece table entries.

sprmTMerge (opcode 161) merges the display areas of cells within a specified range. The 0th byte of the sprm contains the opcode. Byte 1 contains the index of the first cell that is to be merged, call it itcFirst. Byte 2 contains the index of the cell that follows the last cell to be merged, call it itcLim. This sprm causes tap.rgtc[itcFirst].fFirstMerged to be set to 1. Cells in the range whose index is greater than itcFirst and less than itcLim have tap.rgtc[].fMerged set to 1. sprmTMerge is stored only in grpprls linked to piece table entries.

sprmTSplit (opcode 162) splits the display areas of merged cells into their originally assigned display areas. The 0th byte of the sprm contains the opcode. Byte 1 contains the index of the first cell that is to be split, call it itcFirst. Byte 2 contains the index of the cell that follows the last cell to be split, call it itcLim. This sprm clears tap.rgtc[].fFirstMerged and tap.rgtc[].fMerged for all rgtc entries >= itcFirst and < itcLim. sprmTSplit is stored only in grpprls linked to piece table entries.

sprmTSetBrc (opcode 157) allows the border definitions (BRCs) within TCs to be set to new values. It has the following format:

Dec  Hex  field          type  size  bitfield  comments
0    0    sprm           byte                  opcode 157
1    1    itcFirst       byte                  the index of the first cell that is to have its borders changed
2    2    itcLim         byte                  index of the cell that follows the last cell to have its borders changed
3    3                   int   :4    F0        reserved
          fChangeRight   int   :1    08        =1 when tap.rgtc[].brcRight is to be changed
          fChangeBottom  int   :1    04        =1 when tap.rgtc[].brcBottom is to be changed
          fChangeLeft    int   :1    02        =1 when tap.rgtc[].brcLeft is to be changed
          fChangeTop     int   :1    01        =1 when tap.rgtc[].brcTop is to be changed
4    4    brc            BRC                   new BRC value to be stored in TCs

This sprm changes the brc fields selected by the fChange* flags in the sprm to the brc value stored in the sprm, for every tap.rgtc entry whose index is greater than or equal to itcFirst and less than itcLim. sprmTSetBrc is stored only in grpprls linked to piece table entries.

[...] 4 contain the SHD structure, call it shd. This sprm causes the itcLim - itcFirst entries of tap.rgshd to be set to shd. sprmTDxaCol is stored only in grpprls linked to piece table entries.

Complex File Format

The complex file format is used when a file is fast-saved. A complex file has fib.fComplex set to 1. In a complex file, fcClx is the fc where the complex part of the file begins, and cbClx is the size (in bytes) of the complex part. The complex part of the file contains a group of grpprls that encode formatting changes made by the user and a piece table (plcfpcd). The piece table is needed because the text of the document is not stored contiguously in the file after a fast save.

The complex part of a file (CLX) is composed of a number of variable-sized blocks of data. Recorded first are any grpprls that may be referenced by the plcfpcd (if the plcfpcd has no grpprl references, no grpprls will be recorded), followed by the plcfpcd. Each block in the complex part is prefaced by a clxt (clx type), which is a 1-byte code, either 1 (meaning the block contains a grpprl) or 2 (meaning this is the plcfpcd). In both cases, the clxt is followed by a 2-byte cb which is the count of bytes of the grpprl or the piece table.
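
The block structure of the CLX can be walked with a short loop. The following is only a minimal sketch: the function name split_clx and its return shape are hypothetical, and it assumes that the 2-byte cb is stored little-endian and that the CLX bytes (cbClx bytes starting at fcClx) have already been read into memory.

```python
import struct

CLXT_GRPPRL = 1   # block contains a grpprl
CLXT_PLCFPCD = 2  # block contains the piece table (plcfpcd)


def split_clx(clx):
    """Split raw CLX bytes into (list of grpprls, plcfpcd bytes).

    Assumption: cb is a little-endian unsigned 16-bit value.
    """
    grpprls = []
    pos = 0
    while pos + 3 <= len(clx):
        clxt = clx[pos]
        (cb,) = struct.unpack_from("<H", clx, pos + 1)
        body = clx[pos + 3 : pos + 3 + cb]
        if len(body) != cb:
            raise ValueError("CLX block runs past the end of the data")
        if clxt == CLXT_GRPPRL:
            grpprls.append(body)      # kept in the order recorded
        elif clxt == CLXT_PLCFPCD:
            return grpprls, body      # the piece table is recorded last
        else:
            raise ValueError("unknown clxt %d" % clxt)
        pos += 3 + cb
    raise ValueError("no plcfpcd block found")
```

Because the grpprls are collected in the order in which they were recorded, the position of a grpprl in the returned list matches the order of storage in the CLX.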
The formats of the two types of blocks are:

    clxt = 1     clxtGrpprl
    cb           count of bytes in grpprl
    grpprl       see "Definitions" for description of grpprl; a grpprl can
                 contain sprms modifying character, paragraph, table,
                 section or picture properties

or

    clxt = 2     clxtPlcfpcd
    cb           count of bytes in piece table
    plcfpcd      piece table

The entire CLX would look like this, depending on the number of grpprls:

    clxtGrpprl   cb   grpprl (0th grpprl)
    clxtGrpprl   cb   grpprl (1st grpprl)
    ...
    clxtPlcfpcd  cb   plcfpcd

When the prm in a PCD stored in the plcfpcd contains an igrpprl (index to a grpprl), the index stored is the order in which that grpprl was stored in the CLX.

Algorithm to determine the bounds of a paragraph containing a certain character in a complex file

When a document is recorded in non-complex format, the bounds of the paragraph that contains a particular character can be found by calculating the FC coordinate of the character, searching the bin table to find an FKP page that describes that FC, fetching that FKP, and then searching the FKP to find the interval in the rgfc that encloses the character. The [...]

When a document is recorded in complex format, a piece that was originally part of one paragraph can be copied or moved within a different paragraph. To find the beginning of the paragraph containing a character in a complex document, it’s first necessary to search for the piece containing the character in the piece table. Then calculate the FC in the file that stores the character from the piece table information. Using the FC, search the FKP that describes that FC for the largest FC less than the character’s FC, call it fcTest. If the character at fcTest-1 is contained in the current piece, then the character corresponding to that FC in the piece is the first character of the paragraph. If that FC is before or marks the beginning of the piece, scan a piece at a time towards the beginning of the piece table until a piece is found that contains a paragraph mark.
This can be done by using the end of the piece FC, finding the largest FC in its FKP that is less than or equal to the end of piece FC, and checking to see if the character in front of the FKP FC (which must mark a paragraph end) is within the piece. When such an FKP FC is found, the FC marks the first byte of paragraph text. To find the end of a paragraph for a character in a complex format file, again it is necessary to know the piece that contains the character and the FC assigned to the character. Using the FC of the character, first search the FKP that describes the character to find the smallest FC in the rgfc that is larger than the character FC. If the FC found in the FKP is less than or equal to the limit FC of the piece, the end of the paragraph that contains the character is at the FKP FC minus 1. If the FKP FC that was found was greater than the FC of the end of the piece, scan piece by piece toward the end of the document until a piece is found that contains a paragraph end mark. It’s possible to check if a piece contains a paragraph mark by using the FC of the beginning of the piece to search in the FKPs for the smallest FC in the FKP rgfc that is greater than the FC of the beginning of the piece. If the FC found is less than or equal to the limit FC of the piece, then the character that ends the paragraph is the character immediately before the FKP FC. A special procedure must be followed to locate the last paragraph of the main document text when footnote or header/footer text is saved in a Word file (i.e. when fib.ccpFtn != 0 or fib.ccpHdr != 0). In this case the CP of that paragraph mark is fib.ccpText + fib.ccpFtn + fib.ccpHdr + fib.ccpMcr + fib.ccpAtn and the limit CP of the entire plcfpcd is fib.ccpText + fib.ccpFtn + fib.ccpHdr + fib.ccpMcr + fib.ccpAtn + 1. 
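
Both searches above start by locating the piece that contains the character and computing the character's FC. A minimal sketch of that first step follows. The names cp_to_fc, rgcp and rgfc_piece are hypothetical; the sketch assumes the piece table has already been decoded into a list of n+1 piece boundary CPs plus, for each piece, the FC of its first character, and that every character occupies exactly one byte in the file (ANSI text).

```python
from bisect import bisect_right


def cp_to_fc(rgcp, rgfc_piece, cp):
    """Return (piece index, FC) for the character at position cp.

    rgcp       -- n+1 ascending CPs bounding the n pieces (assumed decoded)
    rgfc_piece -- FC of the first character of each piece (assumed decoded)
    Assumption: one byte per character, so FC advances in step with CP.
    """
    if not rgcp[0] <= cp < rgcp[-1]:
        raise ValueError("cp lies outside the piece table")
    # Largest i such that rgcp[i] <= cp.
    i = bisect_right(rgcp, cp) - 1
    return i, rgfc_piece[i] + (cp - rgcp[i])
```

For example, with pieces bounded by CPs 0, 10 and 25 and stored at FCs 1000 and 300, the character at CP 12 lies in piece 1 at FC 302. The limit FC of a piece, needed for the comparisons in the algorithm, is the piece's starting FC plus the piece's length in characters under the same one-byte assumption.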
Algorithm to determine paragraph properties for a paragraph in a complex file

Having found the index i of the FC in an FKP that marks the character stored in the file immediately after the paragraph’s paragraph mark, it is necessary to use fkp.rgb[i - 1] to find the PAPX for the paragraph. Using papx.stc to index into the properties stored for the style sheet, the paragraph properties of the style are copied to a local PAP. Then the grpprl stored in the PAPX is applied to the local PAP, and papx.stc along with papx.phe are moved into the local PAP. The process thus far has created a PAP that describes what the paragraph properties of the paragraph were at the last full save.

Now it’s necessary to apply any paragraph sprms that were linked to the piece that contains the paragraph’s paragraph mark. If pcd.prm.fComplex is 0, pcd.prm contains 1 sprm which should only be applied to the local PAP if it is a paragraph sprm. If pcd.prm.fComplex is 1, pcd.prm.igrpprl is the index of a grpprl in the CLX. If that grpprl contains any paragraph sprms, they should be applied to the local PAP. After applying all of the sprms for the piece, the local PAP contains the correct paragraph property values.

Algorithm to determine table properties for a table row in a complex file

To determine the table properties for a table row in a complex file, scan paragraph-by-paragraph toward the end of the table row, until a paragraph is found that has pap.fTtp set to 1. This paragraph consists of a single row end character. This row end character is linked to the table properties of the row. To create the TAP for the table row, clear a local TAP to zeros. Then the PAPX for the row end character must be fetched from an FKP, and the table sprms that are stored in this PAPX must be applied to the local TAP. The process thus far has created a TAP that describes what the table properties of the table row were at the last full save.
Now apply any table sprms that were linked to the piece that contains the table row’s row end character. If pcd.prm.fComplex is 0, pcd.prm contains 1 sprm which should be applied to the local TAP if it is a table sprm. If pcd.prm.fComplex is 1, pcd.prm.igrpprl is the index of a grpprl in the CLX. If that grpprl contains any table sprms, apply them to the local TAP. After all of the sprms for the piece are applied, the local TAP contains the correct table property values for the table row.

Algorithm to determine the character properties of a character in a complex file

[...] character properties recorded in the style sheet for that style are copied into a local CHP. Then, the piece containing the character is located in the piece table (plcfpcd) and the fc of the character is calculated. Using the character’s FC, the page number of the CHPX FKP that describes the character is found by searching the bin table (hplcfbteChpx). The CHPX FKP stored in that page is fetched and then the rgfc in the FKP is searched to locate the bounds of the run of exception text that encompasses the character. The CHPX for that run is then located within the FKP, and the CHPX is applied to the contents of the local CHP. The process thus far has created a CHP that describes what the character properties of the character were at the last full save.

Now apply any character sprms that were linked to the piece that contains the character. If pcd.prm.fComplex is 0, pcd.prm contains 1 sprm which should be applied to the local CHP if it is a character sprm. If pcd.prm.fComplex is 1, pcd.prm.igrpprl is the index of a grpprl in the CLX. If that grpprl contains any character sprms, apply them to the local CHP. After applying all of the sprms for the piece, the local CHP contains the correct properties for the character.

Characters that are within the same piece, same paragraph, and same run of exception text are guaranteed to have the same properties.
This fact can be used to construct a scanner that can return the limit CPs and properties of a sequence of characters that all have the same properties.

Algorithm to determine the section properties of a section in a complex file

To determine which section a character belongs to and what its section properties are, it is necessary to use the CP of the character to search the plcfsed for the index i of the largest CP that is less than or equal to the character’s CP. plcfsed.rgcp[i] is the CP of the first character of the section and plcfsed.rgcp[i+1] is the CP of the character following the section mark that terminates the section (call it cpLim). Then retrieve plcfsed.rgsed[i]. The FC in this SED gives the location where the SEPX for the section is stored. Then create a local SEP with default section properties. If sed.fc != 0xFFFFFFFF, then the sprms within the SEPX that is stored at offset sed.fc must be applied to the local SEP. The process thus far has created a SEP that describes what the section properties of the section were at the last full save.

Now apply any section sprms that were linked to the piece that contains the section’s section mark. If pcd.prm.fComplex is 0, pcd.prm contains 1 sprm which should be applied to the local SEP if it is a section sprm. If pcd.prm.fComplex is 1, pcd.prm.igrpprl is the index of a grpprl in the CLX. If that grpprl contains any section sprms, they should be applied to the local SEP. After applying all of the section sprms for the piece, the local SEP contains the correct section properties.

Algorithm to determine the PIC of a picture in a complex file

The picture sprms contained in the prm's grpprl apply to any picture characters within the piece that have chp.fSpec == fTrue.
The picture properties for a picture (the PIC described in the Structure Definitions) are derived by fetching the PIC stored with the picture and applying to that PIC any picture sprms linked to the piece containing the picture special character.

Footnotes

In Word for Windows the text of a footnote is anchored to a particular position within the document’s main text, the location of its footnote reference. There is a structure referenced by the fib, the plcffndRef, which records the locations of the footnote references within the main text address space, and another structure referenced by the fib, the plcffndTxt, which records the beginning locations of the corresponding footnote text within the footnote text address space. The footnote text characters in a full saved file begin at offset fib.fcMin + fib.ccpText and extend to fib.fcMin + fib.ccpText + fib.ccpFtn. In a complex fast-saved document, the footnote text begins at CP fib.ccpText and extends to fib.ccpText + fib.ccpFtn.

To find the location of the ith footnote reference in the main text address space, look up the ith entry in the plcffndRef; to find the location of the text corresponding to the reference within the footnote text address space, look up the ith entry in the plcffndTxt.

When there are n footnotes, the plcffndTxt structure consists of n+2 CP entries. The CP entries mark the beginning [...] The last character of footnote text for a footnote (i.e. the character at limit CP - 1) is always a paragraph end (ASCII 13). If there are n footnotes, the n+2nd CP entry value is always 1 greater than the n+1st CP entry value. A paragraph end (ASCII 13) is always stored at the file position marked by the n+1st CP value.

When there are n footnotes, the plcffndRef structure consists of n+1 CP entries followed by n integer flags, named fAuto. The ith CP in the plcffndRef corresponds to the ith fAuto flag. The CP entries give the locations of footnote references within the main text address space.
The n+1st CP entry contains the value fib.ccpText + fib.ccpFtn + fib.ccpHdr + 1. The fAuto flag contains 1 whenever the footnote reference name is auto-generated by Word. When a footnote reference name is automatically generated by Word, Word generates the name by adding 1 to the index number of the reference in the plcffndRef and translating that number to ASCII text. When the footnote reference is auto-generated, the character at the main text CP position for the footnote reference should be a footnote reference character (ASCII 5) which has a chp recorded with chp.fSpec = 1.

The number of footnotes stored in a Word binary file can be found by dividing fib.cbPlcffndTxt by 4 and subtracting 1.

Headers and Footers

The header and footer text characters in a full saved file begin at offset fib.fcMin + fib.ccpText + fib.ccpFtn and extend to fib.fcMin + fib.ccpText + fib.ccpFtn + fib.ccpHdr. In a complex fast-saved document, the header and footer text begins at CP fib.ccpText + fib.ccpFtn and extends to fib.ccpText + fib.ccpFtn + fib.ccpHdr.

The plcfhdd, a table whose location and length within the file are stored in fib.fcPlcfhdd and fib.cbPlcfhdd, describes where the text of each header/footer begins. If there are n headers/footers stored in the Word file, the plcfhdd consists of n+2 CP entries. The beginning CP of the ith header/footer is the ith CP in the plcfhdd. The limit CP (the CP of the character 1 position past the end of a header/footer) of the ith header/footer is the i+1st CP in the plcfhdd. Note that at the limit CP - 1, Word always places a chEop as a placeholder which is never displayed as part of the header/footer. This allows Word to change an existing header/footer to be empty. If there are n headers/footers, the n+2nd CP entry value is always 1 greater than the n+1st CP entry value. A paragraph end (ASCII 13) is always stored at the file position marked by the n+1st CP value.
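
The plcfhdd lookup described above might be sketched as follows. The function name header_footer_range is hypothetical, and the sketch assumes the plcfhdd has already been read into a list of n+2 CPs; it returns the range of displayed text, leaving out the chEop placeholder at limit CP - 1.

```python
def header_footer_range(rgcp_hdd, i):
    """Return (first CP, limit CP of displayed text) for the ith header/footer.

    rgcp_hdd -- the n+2 CP entries of the plcfhdd (assumed already decoded)
    The chEop placeholder at limit CP - 1 is excluded from the range.
    """
    n = len(rgcp_hdd) - 2          # n headers/footers need n+2 CP entries
    if not 0 <= i < n:
        raise IndexError("no such header/footer")
    cp_first = rgcp_hdd[i]
    cp_lim = rgcp_hdd[i + 1]
    return cp_first, cp_lim - 1
```

With the entries 0, 5, 6, 7 (two headers/footers), the first one has displayed text at CPs 0 through 3 plus a placeholder at CP 4, and the second one returns an empty range because it holds only the placeholder.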
The transformation in a full saved file from a header/footer CP to an offset from the beginning of a file (fc) is fc = fib.fcMin + ccpText + ccpFtn + cp.

In Word for Windows, headers/footers can be defined for a document that:

    1) will act as a separator between main text and footnote text
    2) will print below footnote text on a page when footnote text must be continued on a succeeding page (continuation separator)
    3) will print above footnote text on a page when the text must be continued from a previous page (continuation notice)

Also, for each section defined for the document, distinct headers can be defined for printing on odd-numbered/right facing pages, even-numbered/left facing pages and the first page of a section. Similarly, for each document section, distinct footers can be defined for printing on odd-numbered/right facing pages, even-numbered/left facing pages and the first page of a section.

Within the document and the section properties of a document (the DOP and SEP) is a field, the grpfIhdt, which enumerates which of the header/footer types are defined for the document or for a particular section. The grpfIhdt in both [...] corresponding to the bit is defined for the document or for a particular section.

Definition of the bits of dop.grpfIhdt:

    Bit position
    7    footnote separator defined when == 1 (fTrue).
    6    footnote continuation separator defined when == 1 (fTrue).
    5    footnote continuation notice defined when == 1 (fTrue).

Definition of the bits of sep.grpfIhdt:

    Bit position
    7    header for even pages defined when == 1 (fTrue).
    6    header for odd pages defined when == 1 (fTrue).
    5    footer for even pages defined when == 1 (fTrue).
    4    footer for odd pages defined when == 1 (fTrue).
    3    header for first page of section defined when == 1 (fTrue).
    2    footer for first page of section defined when == 1 (fTrue).

Given that a particular footnote separator exists, one can locate the text for that separator using the following algorithm: Initially set ihdd (index into plcfhdd) to 0.
Scan bits 7, 6, and 5 of the dop.grpfIhdt in order looking for bit == 1 while you have not yet reached the bit corresponding to the separator whose text is to be located. For each such bit == 1 add 1 to ihdd. The value of ihdd that results is the index into plcfhdd that can be used to access the text of the separator.

Given that a particular header/footer exists for a particular section, one can locate the text for that header/footer using the following algorithm:

    - Initially set ihdd (index into plcfhdd) to 0.
    - Scan bits 7, 6, and 5 of the dop.grpfIhdt looking for bit == 1 and add 1 to ihdd for each such bit == 1.
    - Examine the sep.grpfIhdt of each section preceding the section of the header/footer to be located, in ascending section number order, scanning bits 7, 6, 5, 4, 3, and 2 of the sep.grpfIhdt in order and adding 1 to ihdd for each bit == 1.
    - For the section of the header/footer to be located, scan bits 7, 6, 5, 4, 3, and 2 of the sep.grpfIhdt in order looking for bit == 1 while you have not yet reached the bit corresponding to the header/footer to be located. For each such bit == 1 add 1 to ihdd.

The value of ihdd that results is the index into plcfhdd that can be used to access the text of the header/footer.

Page Table

The plcfpgd, referenced by the fib, gives the location of page breaks within a Word document and may optionally be saved in a Word binary file. If there are n page breaks calculated for a document, the plcfpgd would consist of n+1 CP entries followed by n PGD entries. Third-party creators of Word for Windows files should not attempt to create a plcfpgd. It can only be created properly using Word for Windows' page layout routines. If a Word for Windows document is edited in any way, the plcfpgd should be deleted by setting fib.cbPlcfpgd to 0.
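
Returning briefly to the two ihdd computations given under Headers and Footers, they might be sketched as follows. The function names are hypothetical, and the sketch assumes that "bit position 7" denotes the bit with value 0x80 of a one-byte grpfIhdt and that the sep.grpfIhdt values are supplied in ascending section order.

```python
DOP_BITS = (7, 6, 5)            # separator, continuation separator, notice
SEP_BITS = (7, 6, 5, 4, 3, 2)   # even/odd headers, even/odd footers, first page


def _count(grpf, bits):
    """Count how many of the given bit positions are set in grpf."""
    return sum((grpf >> b) & 1 for b in bits)


def ihdd_for_separator(dop_grpf, bit):
    """Index into plcfhdd of the footnote separator flagged by `bit`."""
    if bit not in DOP_BITS or not (dop_grpf >> bit) & 1:
        raise ValueError("separator not defined")
    # Count only the bits scanned before reaching `bit` (7 first, then 6, 5).
    return _count(dop_grpf, [b for b in DOP_BITS if b > bit])


def ihdd_for_section(dop_grpf, sep_grpfs, isec, bit):
    """Index into plcfhdd of header/footer `bit` of section `isec`."""
    if bit not in SEP_BITS or not (sep_grpfs[isec] >> bit) & 1:
        raise ValueError("header/footer not defined for this section")
    ihdd = _count(dop_grpf, DOP_BITS)            # all document separators
    for grpf in sep_grpfs[:isec]:                # all preceding sections
        ihdd += _count(grpf, SEP_BITS)
    # Bits of the target section scanned before reaching `bit`.
    return ihdd + _count(sep_grpfs[isec], [b for b in SEP_BITS if b > bit])
```

For instance, with dop.grpfIhdt = 0xA0 (bits 7 and 5 set), the continuation notice (bit 5) is at ihdd 1, since only the footnote separator precedes it.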
If there are n page breaks recorded for the document stored, the n+1st CP stored in the array of CPs for the plcfpgd will have the value fib.ccpText + fib.ccpFtn + fib.ccpHdr + 1 if the document contains footnotes or headers/footers, and will have the value fib.ccpText + fib.ccpFtn + fib.ccpHdr if the document contains no subdocuments.

Glossary Files

[...] beginning positions within the text address space of the file of the text of glossary entries.

The sttbfglsy begins with an integer count of bytes giving the size of the sttbfglsy (the count includes the size of the integer itself). If there are n glossary entries defined, there follow n Pascal-type strings (string preceded by length byte), concatenated one after the other, which store the glossary entry names. The glossary entry names must be sorted in case-insensitive ascending order (i.e. a and A are treated as equal). Also, the names date and time must be included in the list of names. The name of the ith glossary entry is the ith name defined in the sttbfglsy.

If there are n glossary entries, the plcfglsy will consist of n+2 CP entries. The ith CP entry will contain the location of the beginning of the text for the ith glossary entry. The i+1st CP entry will contain the limit CP of the ith glossary entry. The character at a CP position of limit CP - 1 is always a paragraph mark. The n+2nd CP entry always contains fib.ccpText + fib.ccpFtn + fib.ccpHdr + 1 if there are headers, footers or footnotes stored in the glossary and contains fib.ccpText + fib.ccpFtn + fib.ccpHdr otherwise. The n+1st CP entry is always 1 less than the value of the n+2nd entry. The text for the time and date entries will always be a single paragraph mark (ASCII 13).
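
Reading the glossary names out of the sttbfglsy might look like the sketch below. The function name parse_sttbfglsy is hypothetical, and the sketch assumes that the leading integer count is a two-byte little-endian value and that the names are stored in the Windows ANSI code page (decoded here as cp1252).

```python
import struct


def parse_sttbfglsy(data):
    """Return the glossary entry names stored in a raw sttbfglsy.

    Assumptions: the leading count of bytes is a 2-byte little-endian
    integer, and names are Windows ANSI (cp1252) text.
    """
    (cb,) = struct.unpack_from("<H", data, 0)   # size includes these 2 bytes
    if cb > len(data):
        raise ValueError("sttbfglsy is truncated")
    names = []
    pos = 2
    while pos < cb:
        length = data[pos]                      # Pascal-type length byte
        names.append(data[pos + 1 : pos + 1 + length].decode("cp1252"))
        pos += 1 + length
    return names
```

The position of a name in the returned list is the index i used to look up the text bounds of the same entry in the plcfglsy.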
sttbfAssoc (Table of Associated Strings)

The following are indices into a table of associated strings:

ibst                  index  description
ibstAssocFileNext     0      unused
ibstAssocDot          1      filename of associated template
ibstAssocTitle        2      title of document
ibstAssocSubject      3      subject of document
ibstAssocKeyWords     4      keywords of document
ibstAssocComments     5      comments of document
ibstAssocAuthor       6      author of document
ibstAssocLastRevBy    7      name of person who last revised the document
ibstAssocDataDoc      8      filename of data document
ibstAssocHeaderDoc    9      filename of header document
ibstAssocCriteria1    10     packed string used by print merge record selection
ibstAssocCriteria2    11     packed string used by print merge record selection
ibstAssocCriteria3    12     packed string used by print merge record selection
ibstAssocCriteria4    13     packed string used by print merge record selection
ibstAssocCriteria5    14     packed string used by print merge record selection
ibstAssocCriteria6    15     packed string used by print merge record selection
ibstAssocCriteria7    16     packed string used by print merge record selection
ibstAssocMax          17     maximum number of strings in string table

The format of the ibstAssocCriteriaX strings is as follows:

    int  cbIbstAssoc:8;    // BYTE 0  size of ibstAssocCriteriaX string
    int  fCompOr:1;        // BYTE 1  set if condition is an or condition
    int  iCompOp:7;        // BYTE 1  index of Comparison Operator
    char stMergeField[];   // Name of Merge Field
    char stCompInfo[];     // User Supplied Comparison Information

Both stMergeField and stCompInfo are variable length character arrays preceded by a length byte.

BRC: Border Code

The BRC is a substructure of the PAP, PIC and TC. See also the obsolete BRC10 structure.

Dec  Hex  field         type  size  bitfield  comments
0    0    dxpLineWidth  int   :3    0007      width of a single line of border in units of 0.75 points. Each line in the border is this wide (e.g. a double border is three lines). Must be nonzero when brcType is nonzero. Max width currently used = 4, max width allowed = 5.
          brcType       int   :2    0018      border type code:
                                                  0 none
                                                  1 single
                                                  2 thick
                                                  3 double
          fShadow       int   :1    0020      when 1, border is drawn with shadow. Must be 0 when BRC is a substructure of the TC.
          ico           int   :5    07C0      color code (see chp.ico)
          dxpSpace      int   :5    F800      width of space to maintain between border and text within border. Must be 0 when BRC is a substructure of the TC. Stored in points for Windows.

sizeof(BRC) == 2.

BRC10: Border Code for Word for Windows 1.0

Dec  Hex  field            type  size  bitfield  comments
0    0    dxpLine2Width    int   :3    0007      width of second line of border in pixels
          dxpSpaceBetween  int   :3    0038      distance to maintain between both lines of border in pixels
          dxpLine1Width    int   :3    01C0      width of first border line in pixels
          dxpSpace         int   :5    3E00      width of space to maintain between border and text within border. Must be 0 when BRC is a substructure of the TC.
          fShadow          int   :1    4000      when 1, border is drawn with shadow. Must be 0 when BRC10 is a substructure of the TC.
          fSpare           int   :1    8000      reserved

The seven types of border lines that Word for Windows 1.0 supports are coded with different sets of values for dxpLine1Width, dxpSpaceBetween, and dxpLine2Width. The border lines and their brc10 settings follow:

line type               dxpLine1Width                        dxpSpaceBetween  dxpLine2Width
no border               0                                    0                0
single line border      1                                    0                0
two single line border  1                                    1                1
fat solid border        4                                    0                0
hairline border         7 (special value meaning hairline)   0                0

When the no border settings are stored in the BRC, brc.fShadow and brc.dxpSpace should be set to 0.

CHP/CHPX: Character Properties

The CHP and the CHPX have exactly the same field structure. They differ in how the fields are interpreted. Listed below is the format of the CHP/CHPX with the interpretations for each field listed in the comment column. The CHP is never stored in Word files.
It is the result of decompression operations applied to CHPXs. The CHPX is stored in CHPX FKPs and within the STSH. (Note: when a CHPX is stored in an FKP it is prefixed by a one-byte count of bytes that records the size of the non-zero prefix of the CHPX. Since the count of bytes must begin on an even boundary within the FKP followed by the non-zero prefix, it's guaranteed that the int and FC fields of the CHPX are aligned on an odd-byte boundary. Using normal integer or long load instructions will cause address errors on a 68000. The best technique for reconstituting the CHPX is to move the non-zero prefix to the beginning of a local instance of a CHPX that has been cleared to zeros.)

Dec  Hex  field        type  size  bitfield  comment
0    0    fBold        int   :1    0001      CHP: text is bold when 1, and not bold when 0.
                                             CHPX: text boldness is opposite of the boldness of the style's CHP when 1; same as style when 0.
          fItalic      int   :1    0002      CHP: italic when 1, not italic when 0.
                                             CHPX: opposite of style when 1, same as style when 0.
          fRMarkDel    int   :1    0004      CHP: displayed with revision mark strikethrough when 1, no revision mark strikethrough when 0.
                                             CHPX: opposite of style when 1, same as style when 0.
          fOutline     int   :1    0008      CHP: outlined when 1, not outlined when 0.
                                             CHPX: opposite of style when 1, same as style when 0.
          fFldVanish   int   :1    0010      <needs work>
          fSmallCaps   int   :1    0020      CHP: displayed with small caps when 1, no small caps when 0.
                                             CHPX: opposite of style when 1, same as style when 0.
          fCaps        int   :1    0040      CHP: displayed with caps when 1, no caps when 0.
                                             CHPX: opposite of style when 1, same as style when 0.
          fVanish      int   :1    0080      CHP: vanished when 1, not vanished when 0.
                                             CHPX: opposite of style when 1, same as style when 0.
1    1    fRMark       int   :1    0100      CHPX: opposite of style when 1, same as style when 0.
          fStrike      int   :1    0400      CHP: displayed with strikethrough when 1, no strikethrough when 0.
                                             CHPX: opposite of style when 1, same as style when 0.
          fObj         int   :1    0800      CHP: embedded object when 1, not an embedded object when 0.
                                             CHPX: opposite of style when 1, same as style when 0.
1    1    fBoldBi      int   :1    1000      CHP: bidi text is bold when 1, and not bold when 0.
                                             CHPX: bidi text boldness is opposite of the boldness of the style's CHP when 1; same as style when 0.
1    1    fItalicBi    int   :1    2000      CHP: bidi text is italic when 1, not italic when 0.
                                             CHPX: opposite of style when 1, same as style when 0.
1    1    fBiDi        int   :1    4000      BIDI run when 1, latin run when 0
1    1    fDiacUSico   int   :1    8000      diacritics use latin color when 1
                       int   :4    F000      reserved
2    2    fsIco        int   :1    0001      CHP: ignored
                                             CHPX: paragraph chp.ico contents are different than the style CHP's contents.
          fsFtc        int   :1    0002      CHP: ignored
                                             CHPX: chp.ftc is different
          fsHps        int   :1    0004      CHP: ignored
                                             CHPX: chp.hps is different
          fsKul        int   :1    0008      CHP: ignored
                                             CHPX: chp.kul is different
          fsPos        int   :1    0010      CHP: ignored
                                             CHPX: chp.hpsPos is different
          fsSpace      int   :1    0020      CHP: ignored
                                             CHPX: chp.qpsSpace is different
          fsLid        int   :1    0040      CHP: ignored
                                             CHPX: chp.lid is different
          fsIcoBi      int   :1    0080      CHP: ignored
                                             CHPX: paragraph chp.icoBi contents are different than the style CHP's contents.
          fsFtcBi      int   :1    0100      CHP: ignored
                                             CHPX: chp.ftcBi is different
          fsHpsBi      int   :1    0200      CHP: ignored
                                             CHPX: chp.hpsBi is different
          fsLidBi      int   :1    0400      CHP: ignored
                                             CHPX: chp.lidBi is different
                       int   :5    F800      CHP: ignored
                       int   :9    FF80      CHP: ignored
4    4    ftc          WORD                  font code
6    6    hps          WORD                  font size in half points
                                             [...] 1, 62 = -2,...,57 = -7)
          fSysVanish   int   :1    0040      used by Word internally, not stored in file
          chp.fNumRun  int   :1    0080      numbers run when 1
          wSpare2      int   :1    0080      reserved
9    9    ico          int   :5    1F00      color of text:
                                                  0 Auto       9 DkBlue
                                                  1 Black     10 DkCyan
                                                  2 Blue      11 DkGreen
                                                  3 Cyan      12 DkMagenta
                                                  4 Green     13 DkRed
                                                  5 Magenta   14 DkYellow
                                                  6 Red       15 DkGray
                                                  7 Yellow    16 LtGray
                                                  8 White
          kul          int   :3    E000      underline code:
                                                  0 none
                                                  1 single
                                                  2 by word
                                                  3 double
                                                  4 dotted
10   A    hpsPos       BYTE                  position in half points; 0 for normal; positive for superscript; negative for subscripts (2's complement signed number; 256 - hpsPos is the absolute value for negative numbers)
11   B    icoBi        BYTE                  color of Bidi text. values same as chp.ico
11   B    wSpare3      BYTE                  reserved
12   C    lid          LID                   language identification code (see following table)
14   E    ftcBi        WORD                  bidi font code
16   10   hpsBi        WORD                  bidi font size in half points
18   12   lidBi        LID                   bidi language identification code (see following table)
20   14   fcPic        FC                    when character is a picture character (character is 0x01 and chp.fSpec is 1)
20   14   fcObj        FC                    when character is an object character (character is 0x20 and chp.fSpec is 1)
23   17   fnPic        BYTE                  used by Word internally.
20   14   hpsLargeChp  int
14   E    fcPic        FC                    when character is a picture character (character is 0x01 and chp.fSpec is 1)
14   E    fcObj        FC                    when character is an object character (character is 0x20 and chp.fSpec is 1)
17   11   fnPic        BYTE                  used by Word internally.
14   E    hpsLargeChp  int

sizeof(CHP) == 18 == 0x12.
sizeof(CHP) == 12 == 0xC.
Language Name           LID     Language Name        LID     Language Name              LID
Albanian                0x041c  French               0x040c  Norwegian - Nynorsk        0x0814
Arabic                  0x0401  French, Belgian      0x080c  Polish                     0x0415
Chinese, Traditional    0x0404  German, Swiss        0x0807  Romanian                   0x0418
Chinese, Simplified     0x0804  Greek                0x0408  Russian                    0x0419
Croato-Serbian (Latin)  0x041a  Hebrew               0x040d  Serbo-Croatian (Cyrillic)  0x081a
Czech                   0x0405  Hungarian            0x040e  Slovak                     0x041b
Danish                  0x0406  Icelandic            0x040f  Spanish, Castilian         0x040a
Dutch                   0x0413  Italian              0x0410  Spanish, Mexican           0x080a
Dutch, Belgian          0x0813  Italian, Swiss       0x0810  Swedish                    0x041d
English, Australian     0x0c09  Japanese             0x0411  Thai                       0x041e
English, U.K.           0x0809  Korean               0x0412  Turkish                    0x041f
English, U.S.           0x0409  Norwegian - Bokmal   0x0414  Urdu                       0x0420
Finnish                 0x040b

CHP10/CHPX: Character Properties for Word for Windows 1.0

Dec  Hex  field       type  size  bitfield  comment
0    0    fBold       int   :1    0001      for the CHP, text is bold when 1, and not bold when 0.
                                            for the CHPX, text boldness is opposite of the boldness
                                            of the style's CHP when 1; same as style when 0.
          fItalic     int   :1    0002      CHP: italic when 1, not italic when 0
                                            CHPX: opposite of style when 1, same as style when 0.
          fStrike     int   :1    0004      CHP: displayed with strikethrough when 1, no strikethrough when 0
                                            CHPX: opposite of style when 1, same as style when 0.
          fOutline    int   :1    0008      CHP: outlined when 1, not outlined when 0
                                            CHPX: opposite of style when 1, same as style when 0.
          fFldVanish  int   :1    0010      <needs work>
          fSmallCaps  int   :1    0020      CHP: displayed with small caps when 1, no small caps when 0
                                            CHPX: opposite of style when 1, same as style when 0.
          fCaps       int   :1    0040      CHP: displayed with caps when 1, no caps when 0
                                            CHPX: opposite of style when 1, same as style when 0.
          fVanish     int   :1    0080      CHP: vanished when 1, not vanished when 0
                                            CHPX: opposite of style when 1, same as style when 0.
1    1    fRMark      int   :1    0100      <needs work>
          fSpec       int   :1    0200      CHP: character is a Word special character when 1,
                                            not a special character when 0
                                            CHPX: opposite of style when 1, same as style when 0.
fsFtc int :1 0800 CHP: ignored CHPX: chp.ftc is different fsHps int :1 1000 CHP: ignored CHPX: chp.hps is different fsKul int :1 2000 CHP: ignored CHPX: chp.kul is different fsPos int :1 4000 CHP: ignored CHPX: chp.hpsPos is different fsSpace int :1 8000 CHP: ignored CHPX: chp.qpsSpace is different 2 2 ftc uns font code 4 4 hps uns char font size in half points 5 5 hpsPos uns char position in half points: 0 for normal; positive for superscript; negative for subscripts (2's complement signed number; 256 - hpsPos is the absolute value for negative numbers) 6 6 qpsSpace int :6 003F space following the character in quarter point units (range -7 through +56 qp's; represented in excess-56 notation: 63 = - 1, 62 = -2,...,57 = -7) wSpare2 int :2 00C0 reserved ico int :4 0F00 color of text: 0 Black 1 Blue 2 Cyan 3 Green 4 Magenta 5 Red 6 Yellow 7 White kul int :3 7000 underline code: 0 none 1 single 2 by word 3 double 4 dotted fSysVanish int :1 8000 used by Word internally, not stored in file 8 8 fcPic FC when character is a picture or hand- annotation character (character is 0x01 or 0x07 and chp.fSpec is 1) 11 B fnPic uns char used by Word internally. 8 8 hpsLargeChp int sizeof(CHP) == 12 == 0xC. DOP: Document Properties Dec Hex field type size bitfield default value comment 0 0 fFacingPages int :1 0001 0 1 when facing pages should be printed fPMHMainDoc int :1 0004 0 1 when doc is a main doc for Print Merge Helper, 0 when not; default = 0 grfSuppression int :2 0018 0 Default line suppression storage; 0= form letter line suppression; 1= no line suppression; default = 0 fpc int :2 0060 1 footnote position code 0 print as endnotes 1 print at bottom of page 2 print immediately beneath text int :1 0080 0 unused 1 1 grpfIhdt int :8 FF00 0 specification of document headers and footers. See explanation under Headers and Footers topic. 
2 2 fFtnRestart int :1 0001 1 == 1 when footnote number is to be reset to 1 for each page nFtn int :15 FFFE 1 initial footnote number for document 4 4 irmBar BYTE 00FF 5 5 irmProps int :7 7F00 fRevMarking int :1 8000 6 6 fBackup int :1 0001 always make backup when document saved when 1. fExactCWords int :1 0002 fPagHidden int :1 0004 fPagResults int :1 0008 fLockAtn int :1 0010 fMirrorMargins int :1 0020 swap margins on left/right pages when 1. fKeepFileFormat int :1 0040 save as original file format when 1 fDfltTrueType int :1 0080 Use TrueType fonts by default 7 7 fPagSuppressTopSpacing int :1 0100 fRTLAlignment int :1 0200 Document is RTL if 1 int :6 FC00 int :7 FE00 8 8 fSpares int :16 FFFF 10 A dxaTab uns 720 twips default tab width 12 C ftcDefaultBi uns index to default font in sttb 12 C wSpare uns 14 E dxaHotZ uns 16 10 wSpare2 uns 18 12 wSpare3 uns reserved 20 14 dttmCreated DTTM 24 18 dttmRevised DTTM 28 1C dttmLastPrint DTTM 32 20 nRevision int 46 2E cPg int 48 30 rgwSpareDocSum int[2] DTTM: Date and Time (internal date format) Dec Hex field type size bitfield comment 0 0 mint unsigned :6 003F minutes (0-59) hr unsigned :5 07C0 hours (0-23) dom unsigned :5 F800 days of month (1-31) 2 2 mon unsigned :4 000F months (1-12) yr unsigned :9 1FF0 years (1900-2411)-1900 wdy unsigned :3 E000 weekday, Sunday = 0, Monday = 1, Tuesday = 2, Wednesday = 3, Thursday = 4, Friday = 5, Saturday = 6 sizeof(DTTM) == 4. FIB: File Information Block Dec Hex field type size bitfield comment 0 0 wIdent uns magic number (added values for Bidi) 2 2 nFib uns FIB version written (added value for Bidi) 4 4 nProduct uns product version written by 6 6 lid uns language stamp---localized version; In Word for Windows 1.x files this value was the nLocale. If value is < 999, then it is the nLocale, otherwise it is the lid. 8 8 pnNext PN 10 A fDot uns :1 0001 fGlsy uns :1 0002 fComplex uns :1 0004 when 1, file is in complex, fast-saved format. 
          fHasPic      uns     :1    0008      file contains 1 or more pictures
          cQuickSaves  uns     :4    00F0      count of times file was quicksaved
11   B    fEncrypted   uns     :1    0100      1 if file is encrypted, 0 if not
                       uns     :7    FF00      unused
12   C    nFibBack     uns                     new values for Bidi
14   E    Spare        long                    reserved
18   12   rgwSpare0    uns[3]                  reserved
24   18   fcMin        FC                      file offset of first character of text. In non-complex
                                               files a CP can be transformed into an FC by the
                                               following transformation: fc = cp + fib.fcMin.
28   1C   fcMac        FC                      file offset of last character of text in document
                                               text stream + 1
32   20   cbMac        FC                      file offset of last byte written to file + 1.
36   24   fcSpare0     FC                      reserved
40   28   fcSpare1     FC                      reserved
44   2C   fcSpare2     FC                      reserved
48   30   fcSpare3     FC                      reserved
52   34   ccpText      CP                      length of main document text stream
56   38   ccpFtn       CP                      length of footnote subdocument text stream

Note: when ccpFtn == 0 and ccpHdd == 0 and ccpMcr == 0 and ccpAtn == 0, then
fib.fcMac = fib.fcMin + fib.ccpText. If any of ccpFtn, ccpHdd, ccpMcr, or ccpAtn
is nonzero, then fib.fcMac = fib.fcMin + fib.ccpText + fib.ccpFtn + fib.ccpHdd +
fib.ccpMcr + fib.ccpAtn + 1. The two characters stored beginning at file position
fib.fcMac - 2 must always be a CRLF pair (ASCII 13, ASCII 10).

72   48   ccpSpare0    CP                      reserved
76   4C   ccpSpare1    CP                      reserved
80   50   ccpSpare2    CP                      reserved
84   54   ccpSpare3    CP                      reserved
88   58   fcStshfOrig  FC                      file offset of original allocation for STSH in file.
                                               During fast save Word will attempt to reuse this
                                               allocation if STSH is small enough to fit.
92   5C   cbStshfOrig  uns                     count of bytes of original STSH allocation
94   5E   fcStshf      FC                      file offset of STSH in file.
98   62   cbStshf      uns                     count of bytes of current STSH allocation
100  64   fcPlcffndRef FC                      file offset of footnote reference PLC. CPs in PLC are
                                               relative to main document text stream and give
                                               location of footnote references. The structure stored
                                               in this PLC, called the FRD (footnote reference
                                               descriptor), is two bytes long.
104 68 cbPlcffndRef uns count of bytes of footnote reference PLC == 0 if no footnotes defined in document. 106 6A fcPlcffndTxt FC file offset of footnote text PLC. CPs in PLC are relative to footnote subdocument text stream and give location of beginnings of footnote text for corresponding references recorded in plcffndRef. No structure is stored in this plc. There will just be n+1 FC entries in this PLC when there are n footnotes 110 6E cbPlcffndTxt uns count of bytes of footnote text PLC. == 0 if no footnotes defined in document 112 70 fcPlcfandRef FC file offset of annotation reference PLC. 116 74 cbPlcfandRef uns 118 76 fcPlcfandTxt FC file offset of annotation text PLC. 122 7A cbPlcfandTxt uns 124 7C fcPlcfsed FC file offset of section descriptor PLC. CPs in PLC are relative to main document. The length of the SED is 6 bytes. 128 80 cbPlcfsed uns count of bytes of section descriptor PLC. 8 bytes. 134 86 cbPlcfpgd uns count of bytes of page descriptor PLC. ==0 if file was never repaginated. Should not be written by third party creators of Word files. 136 88 fcPlcfphe FC file offset of PLC of paragraph heights. CPs in PLC are relative to main document text stream. Only written for files in complex format. Should not be written by third party creators of Word files. The PHE is 6 bytes long. 140 8C cbPlcfphe uns count of bytes of paragraph height PLC. ==0 when file is non-complex. 142 8E fcSttbfglsy FC file offset of glossary string table 146 92 cbSttbfglsy uns count of bytes of glossary string table. == 0 for non-glossary documents. !=0 for glossary documents. 148 94 fcPlcfglsy FC file offset of glossary PLC. CPs in PLC are relative to main document and mark the beginnings of glossary entries and are in 1-1 correspondence with entries of sttbfglsy. No structure is stored in this PLC. There will be n+1 FC entries in this PLC when there are n glossary entries. 152 98 cbPlcfglsy uns count of bytes of glossary PLC. == 0 for non-glossary documents. 
!=0 for glossary documents. 154 9A fcPlcfhdd FC byte offset of header PLC. CPs are relative to header subdocument and mark the beginnings of individual headers in the header subdocument. No structure is stored in this PLC. There will be n+1 FC entries in this PLC when there are n headers stored for the document. 158 9E cbPlcfhdd uns count of bytes of header PLC. == 0 if document contains no headers 160 A0 fcPlcfbteChpx FC file offset of character property bin table.plc. FCs in PLC are file offsets. Describes text of main document and all subdocuments. The BTE is 2 bytes long. 164 A4 cbPlcfbteChpx uns count of bytes of character property bin table PLC. 166 A6 fcPlcfbtePapx FC file offset of paragraph property bin table.plc. FCs in PLC are file offsets. Describes text of main document and all subdocuments. The BTE is 2 bytes long. 172 AC fcPlcfsea FC file offset of PLC reserved for private use. The SEA is 6 bytes long. 176 B0 cbPlcfsea uns count of bytes of private use PLC. 178 B2 fcSttbfffn FC 182 B6 cbSttbfffn uns 184 B8 fcPlcffldMom FC 188 BC cbPlcffldMom uns 190 BE fcPlcffldHdr FC 194 C2 cbPlcffldHdr uns 196 C4 fcPlcffldFtn FC 200 C8 cbPlcffldFtn uns 202 CA fcPlcffldAtn FC 206 CE cbPlcffldAtn uns 208 D0 fcPlcffldMcr FC 212 D4 cbPlcffldMcr uns 214 D6 fcSttbfbkmk FC 218 DA cbSttbfbkmk uns 220 DC fcPlcfbkf FC 224 E0 cbPlcfbkf uns 226 E2 fcPlcfbkl FC 230 E6 cbPlcfbkl uns 232 E8 fcCmds FC 236 EC cbCmds uns 238 EE fcPlcmcr FC 242 F2 cbPlcmcr uns 244 F4 fcSttbfmcr FC 248 F8 cbSttbfmcr uns 250 FA fcPrDrvr FC file offset of the printer driver information (names of drivers, port, etc.) 254 FE cbPrDrvr uns count of bytes of the printer driver information (names of drivers, port, etc.) 256 100 fcPrEnvPort FC file offset of the print environment in portrait mode. 260 104 cbPrEnvPort uns count of bytes of the print environment in portrait mode. 262 106 fcPrEnvLand FC file offset of the print environment in landscape mode. 
268 10C fcWss FC file offset of Window Save State data structure. WSS contains dimensions of document's main text window and the last selection made by Word user. 272 110 cbWss uns count of bytes of WSS. ==0 if unable to store the window state. Should not be written by third party creators of Word files. 274 112 fcDop FC file offset of document property data structure. 278 116 cbDop uns count of bytes of document properties. 280 118 fcSttbfAssoc FC 284 11C cbSttbfAssoc uns 286 11E fcClx FC file of offset of beginning of information for complex files. Consists of an encoding of all of the prms quoted by the document followed by the plcpcd (piece table) for the document. 290 122 cbClx uns count of bytes of complex file information. == 0 if file is non-complex. 292 124 fcPlcfpgdFtn FC file offset of page descriptor PLC for footnote subdocument. CPs in PLC are relative to footnote subdocument. Should not be written by third party creators of Word files. 296 128 cbPlcfpgdFtn uns count of bytes of page descriptor PLC for footnote subdocument. ==0 if document has not been paginated. The length of the PGD is 8 bytes. 298 12A fcAutosaveSource FC file offset of the name of the original file. fcAutosaveSource and cbAutosaveSource should both be 0 if autosave is off. 302 12E cbAutosaveSource uns count of bytes of the name of the original file. 304 130 fcSpare5 FC 308 134 cbSpare5 uns 310 136 fcSpare6 FC 314 13A cbSpare6 uns 316 13C wSpare4 int 318 13E pnChpFirst PN 320 140 pnPapFirst PN 322 142 cpnBteChp PN count of CHPX FKPs recorded in file. In non-complex files if the number of entries in the plcfbteChpx is less than this, the plcfbteChpx is incomplete. plcfbtePapx is incomplete. Note: If a table does not exist in the file, its cb in the FIB is zero and its fc is equal to that of the following table (the latter equality is irrelevant, as the cb should be used to determine existence of the table). 
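Two rules from the FIB description, the CP-to-FC transformation for non-complex files (fc = cp + fib.fcMin) and the use of cb to decide whether a table exists, can be sketched in Python as follows. This is an illustrative sketch, not part of the original specification: the function names are hypothetical, the field offsets are taken from the FIB table (flags word at 10, fcMin at 24), and it assumes little-endian storage with FC read as an unsigned 4-byte value and cb as an unsigned 2-byte value.

```python
import struct


def fib_is_complex(fib):
    """True when fComplex (mask 0004 of the word at offset 10) is set."""
    (flags,) = struct.unpack_from("<H", fib, 10)
    return bool(flags & 0x0004)


def cp_to_fc(fib, cp):
    """Map a CP to a file offset; only valid for non-complex files."""
    if fib_is_complex(fib):
        # Fast-saved files need the piece table stored in the CLX instead.
        raise ValueError("complex file: fc = cp + fcMin does not apply")
    (fc_min,) = struct.unpack_from("<I", fib, 24)  # fcMin at offset 24
    return cp + fc_min


def table_location(fib, fc_offset):
    """Return (fc, cb) for the fc/cb pair starting at fc_offset, or None.

    Existence is decided by cb alone; the fc of a missing table only
    repeats the fc of the following table and carries no information.
    """
    fc, cb = struct.unpack_from("<IH", fib, fc_offset)
    return (fc, cb) if cb != 0 else None


# Example with a synthetic, zero-filled FIB buffer.
fib = bytearray(326)
struct.pack_into("<I", fib, 24, 0x300)          # fcMin
struct.pack_into("<IH", fib, 100, 0x400, 0)     # fcPlcffndRef, cbPlcffndRef
print(cp_to_fc(fib, 5))          # file offset of the character at CP 5
print(table_location(fib, 100))  # None: no footnote reference PLC
```

Passing the offset of the fc field (for example 100 for fcPlcffndRef) keeps the helper generic, since every table in the FIB is recorded as a 4-byte fc followed by a 2-byte cb.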
FKP: Formatted Disk Page

offset (Dec) / field / type, followed by comments

0    rgfc    array of FCs
     For CHPX FKPs, each FC is the limit FC of a run of exception text.
     For PAPX FKPs, each FC is the limit FC of a paragraph (i.e. points to the
     next character past an end of paragraph mark).

4 * (fkp.crun + 1)    rgb    array of bytes
     an array of bytes where each byte is the word offset of a CHPX or PAPX.
     For CHPXs, if the byte stored is 0, there is no difference between run's
     character properties and the style's character properties. For PAPXs, if
     the byte stored is 0, this represents a 1 line paragraph 15 pixels high
     with Normal style (stc == 0) whose column width is 7980 dxas.

5 * fkp.crun + 4    unused space
     As new runs/paragraphs are recorded in the FKP, unused space is reduced
     by 5 if CHPX/PAPX is already recorded and is reduced by
     5 + sizeof(CHPX/PAPX) if property is not already recorded.

for CHPX FKPs: 511 - sizeof(grpchpx)    grpchpx    array of bytes
     grpchpx consists of all of the CHPXs stored in FKP concatenated end to
     end. Each CHPX is prefixed with a count of bytes which records its length.

for PAPX FKPs: 511 - sizeof(grppapx)    grppapx    array of bytes
     grppapx consists of all of the PAPXs stored in FKP concatenated end to
     end. Each PAPX begins with a count of words which records its length
     padded to a word boundary.

511    crun    byte
     count of runs for CHPX FKP, count of paragraphs for PAPX FKP.

The PAP is never stored in a Word file. It is derived by expanding stored PAPXs.

FLD: Field Descriptor

Dec  Hex  field           type  size  bitfield  comments
0    0    ch              int   7               type of field boundary the FLD describes.
                                                19 field begin mark
                                                20 field separator
                                                21 field end mark
          fDirty          int   :1
variant used when fld.ch == 21 (field end mark)
1    1    fDiffer         int   :1    01        ignored for saved file
                          int   :1    02        reserved
          fResultDirty    int   :1    04        == 1, when user has edited or formatted the result.
                                                == 0 otherwise
          fResultEdited   int   :1    08        == 1, when user has inserted text into or deleted
                                                text from the result.
          fLocked         int   :1    10        == 1, when field is locked from recalc
          fPrivateResult  int   :1    20        == 1, whenever the result of the field is never
                                                to be shown.
          fNested         int   :1    40        == 1, when field is nested within another field
                          int   :1    80        reserved

sizeof(FLD) == 2.

flt  Field Type
1    unknown keyword
2    possible bookmark (syntax matches bookmark name)
3    bookmark reference
4    index entry
5    footnote reference
6    Set command (for Print Merge)
7    If command (for Print Merge)
8    create index
9    table of contents entry
10   Style reference
11   document reference
12   sequence mark
13   create table-of-contents
14   quote Info variable
15   quote Title variable
16   quote Subject variable
17   quote Author variable
18   quote Keywords variable
19   quote Comments variable
20   quote Last Revised By variable
21   quote Creation Date variable
22   quote Revision Date variable
23   quote Print Date variable
24   quote Revision Number variable
25   quote Edit Time variable
26   quote Number of Pages variable
27   quote Number of Words variable
28   quote Number of Characters variable
29   quote File Name variable
30   quote Document Template Name variable
31   quote Current Date variable
32   quote Current Time variable
33   quote Current Page variable
34   evaluate expression
35   insert literal text
36   Include command (Print Merge)
37   page reference
38   Ask command (Print Merge)
39   Fillin command to display prompt (Print Merge)
40   Data command (Print Merge)
41   Next command (Print Merge)
42   NextIf command (Print Merge)
43   SkipIf (Print Merge)
44   inserts number of current Print Merge record
45   DDE reference
46   DDE automatic reference
47   Inserts Glossary Entry
48   sends characters to printer without translation
49   Formula definition
50   Goto Button
51   Macro Button
52   insert auto numbering field in outline format
53   insert auto numbering field in legal format
54   insert auto numbering field in Arabic number format
55   reads a TIFF file
56   Link
57   Symbol
58   Embedded Object
59   Merge fields
60   User Name
61   User Initial
62   User Address

0 0 lcb long length
of object (including this header) 4 4 cbHeader int length of this header (for future use) 6 6 icf int index to clipboard format of object sizeof(OBJHEADER) == 8. PAP: Paragraph Properties Dec Hex field type size bitfield comments 0 0 stc uns char style code. This is an index into the STSH structure 1 1 jc uns char Justification Code 0 left justify 1 center 2 right justify 3 left and right justify 2 2 fSideBySide uns char side-by-side paragraph 3 3 fKeep uns char keep entire paragraph on one page if possible 4 4 fKeepFollow uns char keep paragraph on same page with next paragraph if possible 5 5 fPageBreakBefore uns char start this paragraph on new page 6 6 fUnused int :4 000F reserved pcVert int :2 0030 vertical position code. Specifies coordinate frame to use when paragraphs are absolutely positioned. 0 vertical position coordinates are relative to margin 1 coordinates are relative to page 2 coordinates are relative to text. This means: relative to where the next non- APO text would have been placed if this APO did not exist. pcHorz int :2 00C0 horizontal position code. Specifies coordinate frame to use when paragraphs are absolutely positioned. 0 horiz. position coordinates are relative to column. 1 coordinates are relative to margin 2 coordinates are relative to page /* the brcp and brcl fields have been superseded by the newly defined brcLeft, brcTop, etc. fields. They remain in the PAP for compatibility with MacWord 3.0 */ 7 7 brcp uns char rectangle border codes 0 none 1 border above 2 border below 15 box around 16 bar to left of paragraph 8 8 brcl uns char border line style 0 single 1 thick 2 double 3 shadow line numbering) 12 C dxaRight int indent from right margin (signed). 14 E dxaLeft int indent from left margin (signed) 16 10 dxaLeft1 int first line indent; signed number relative to dxaLeft 18 12 dyaLine int height of line. When 0, Word will automatically allocate space to each line so that every character is completely visible. 
If positive, Word will set line heights so that every line is at least dyaLine dyas high. If negative, the height of each line of the paragraph will be set equal to the absolute value of dyaLine. 20 14 dyaBefore uns vertical spacing before paragraph (unsigned) 22 16 dyaAfter uns vertical spacing after paragraph (unsigned) 24 18 phe PHE height of current paragraph. 30 1E fInTable char when 1, paragraph is contained in a table row 31 1F fTtp char when 1, paragraph consists only of the row mark special character and marks the end of a table row. 32 20 ptap TAP * used internally by Word 34 22 dxaAbs int when positive, is the horizontal distance from the reference frame specified by pap.pcHorz. 0 means paragraph is positioned at the left with respect to the reference frame specified by pcHorz. Certain negative values have special meaning: -4 paragraph centered horizontally within reference frame -8 paragraph adjusted right within reference frame -12 paragraph placed immediately inside of reference frame -16 paragraph placed immediately outside of reference frame 36 24 dyaAbs int when positive, is the vertical distance from the reference frame specified by pap.pcVert. 0 means paragraph's y- position is unconstrained. . Certain negative values have special meaning: -4 paragraph is placed at top of reference frame -8 paragraph is centered vertically within reference frame -12 paragraph is placed at bottom of reference frame. 40 28 brcTop BRC specification for border above paragraph 42 2A brcLeft BRC specification for border to the left of paragraph 44 2C brcBottom BRC specification for border below paragraph 46 2E brcRight BRC specification for border to the right of paragraph 48 30 brcBetween BRC specification of border to place between conforming paragraphs. Two paragraphs conform when both have borders, their brcLeft and brcRight matches, their widths are the same, they both belong to tables or both do not, and have the same absolute positioning props. 
50 32 brcBar BRC specification of border to place on outside of text when facing pages are to be displayed. 52 34 dxaFromText int horizontal distance to be maintained between an absolutely positioned paragraph and any non-absolute positioned text 54 36 dyaFromText int vertical distance to be maintained between an absolutely positioned paragraph and any non-absolute positioned text 56 38 wr byte Wrap Code for absolute objects 57 39 zz byte Reserved; currently unused 58 3A fTransparent byte Reserved, currently unused 59 3b fBiDi byte RTL paragraph when 1 59 3B bSpare byte Reserved 60 3C dyaHeight int :15 7FFF height of abs obj; 0 == Auto fMinHeight int :1 8000 0 = Exact, 1 = At Least 62 3E shd SHD shading 64 40 itbdMac int number of tabs stops defined for paragraph. Must be >= 0 and <= 50. 66 42 rgdxaTab int[itbdMax] array of positions of itbdMac tab stops. itbdMax == 50 166 A6 rgtbd char[itbdMax] array of itbdMac tab descriptors sizeof(PAP) == 216 == 0xD8. PAPX: Paragraph Property Exceptions The PAPX is stored within FKPs and within the STSH. Dec Hex field type size bitfield comments 0 0 cw byte count of words of following data in PAPX. The first byte of a PAPX is a count of words when PAPX is stored in an FKP. Count of words is used because PAPX in an FKP can contain paragraph and table sprms. Count of bytes is used because only paragraph sprms are stored in a STSH PAPX. 1 1 stc byte style code of the style from which the paragraph inherits its paragraph and character properties 2 2 phe PHE encoding of paragraph height information for paragraph. 8 8 grpprl character array a list of the sprms that encode the differences between PAP for a paragraph and the PAP for the style used. When a paragraph bound is also the end of a table row, the PAPX also contains a list of table sprms which express the difference of table row's TAP from an empty TAP that has been cleared to zeros. The table sprms are recorded in the list after all of the paragraph sprms. 
See Sprms definitions for list of sprms that are used in PAPXs. papx.cw is equal to (8 + sizeof(grpprl) + 1) / 2. If the size of the grpprl is odd, a byte of zero is stored immediately after the grpprl to pad the PAPX so its length in bytes is papx.cw * 2. PCD: Piece Descriptor Dec Hex field type size bitfield comment 0 0 fNoParaLast int :1 0001 when 1, means that piece contains no end of paragraph marks. fPaphNil int :1 0002 used internally by Word * int :6 1 1 fn uns char used internally by Word 2 2 fc FC file offset of beginning of piece. The size of the ith piece can be determined by subtracting rgcp[i] of the containing plcfpcd from its rgcp[i+1]. 6 6 prm PRM contains either a single sprm or else an index number of the grpprl which contains the sprms that modify the properties of the piece. 8 8 cbPCD PGD: Page Descriptor Dec Hex field type size bitfield comments 0 0 * int :5 001F fGhost int :2 0060 redefine fEmptyPage and fAllFtn. true when blank page or footnote only page * int :9 FF10 0 0 fContinue int :1 0001 1 only when footnote is continued from previous page fRight int :1 0008 1 when right hand side page fPgnRestart int :1 0010 1 when page number must be reset to 1. fEmptyPage int :1 0020 1 when section break forced page to be empty. fAllFtn int :1 0040 1 when page contains nothing but footnotes * int :1 0080 bkc int :8 FF00 section break code 2 2 lnn uns line number of first line, -1 if no line numbering 4 4 cl int count of lines into paragraph for first line. 6 6 pgn uns page number as printed 8 8 dcpDepend int number of characters at the beginning of the next page that were considered for inclusion on current page before page break was forced. sizeof(PGD) == 10 == 0xA. PHE: Paragraph Height The PHE is a substructure of the PAP and PAPX and is also stored in the PLCFPHE. 
Dec  Hex  field        type      size  bitfield  comments
0    0    fSpare       int       :1    0001      reserved
          fUnk         int       :1    0002      phe entry is invalid when == 1
          fDiffLines   int       :1    0004      when 1, total height of paragraph is known but
                                                 lines in paragraph have different heights.
          *            int       :5    00F8      reserved
          clMac        int       :8    FF00      when fDiffLines is 0, is the number of lines in
                                                 the paragraph
2    2    dxaCol       int                       width of lines in paragraph
4    4    dylLine      int                       when fDiffLines is 0, is the height in pixels of
                                                 every line in the paragraph
4    4    dylHeight    uns                       when fDiffLines is 1, is the total height in
                                                 pixels of the paragraph
4    4    fStyleDirty  int                       when PAPXs are stored in STSH, this indicates that
                                                 the style containing this PAPX has changed, so
                                                 paragraph height information stored for paragraphs
                                                 with this style is invalid.

sizeof(PHE) == 6.

If there is no paragraph height information stored for a paragraph, all of the fields in the PHE are set to 0.

If a paragraph contains more than 127 lines, the clMac, dylLine variant cannot be used, so fDiffLines must be set to 1 and the total size of the paragraph stored in dylHeight.

If a paragraph height is greater than 32767 twips, the height cannot be represented by a PHE so all fields of the PHE must be set to 0.

If a new Word for Windows file is created, the PHE of every PAPX created to describe the paragraphs of the file should be set to 0.

If a Word for Windows file is altered in place (a character of the file changed to a new character or a property changed), the paragraph containing the change must have its papx.phe field set to 0.

0    0    lcb          long                      number of bytes in the PIC structure plus size of
                                                 following picture data, which may be a Windows
                                                 metafile, a bitmap, or the filename of a TIFF file.
4    4    cbHeader     unsigned                  number of bytes in the PIC (to allow for future
                                                 expansion).
6    6    mfp.mm       int
8    8    mfp.xExt     int
10   A    mfp.yExt     int
12   C    mfp.hMF      int

If a Windows metafile is stored immediately following the PIC structure, the mfp is a Windows METAFILEPICT structure.
When the data immediately following the PIC is a TIFF filename, mfp.mm == 98 If a bitmap is stored after the PIC, mfp.mm == 99 When the PIC describes a bitmap, mfp.xExt is the width of the bitmap in pixels and mfp.yExt is the height of the bitmap in pixels.. 14 E bm BITMAP (14 bytes) Window's bitmap structure when PIC describes a BITMAP. 14 E rcWinMF rect (8 bytes) rect for window origin and extents when metafile is stored -- ignored if 0 28 1C dxaGoal int horizontal measurement in twips of the rectangle the picture should be imaged within. 30 1E dyaGoal int vertical measurement in twips of the rectangle the picture should be imaged within. when scaling bitmaps, dxaGoal and dyaGoal may be ignored if the operation would cause the bitmap to shrink or grow by a non-power-of-two factor 32 20 mx uns horizontal scaling factor supplied by user expressed in .001% units. 34 22 my uns vertical scaling factor supplied by user expressed in .001% units. For all of the Crop values, a positive measurement means the specified border has been moved inward from its original setting and a negative measurement means the border has been moved outward from its original setting. 36 24 dxaCropLeft int the amount the picture has been cropped on the left in twips. 38 26 dyaCropTop int the amount the picture has been cropped on the top in twips. 40 28 dxaCropRight int the amount the picture has been cropped on the right in twips. 42 2A dyaCropBottom int the amount the picture has been cropped on the bottom in twips. 44 2C brcl int :4 000F Obsolete, superseded by brcTop, etc. 
                                                 In Word for Windows 1.x, it was the type of border
                                                 to place around picture
                                                 0 single
                                                 1 thick
                                                 2 double
                                                 3 shadow
          fFrameEmpty  int       :1    0010      picture consists of a single frame
                       int       :11             reserved
52   34   brcRight     BRC                       specification for border to the right of picture
54   36   dxaOrigin    int                       horizontal offset of hand annotation origin
56   38   dyaOrigin    int                       vertical offset of hand annotation origin
58   3A   rgb          variable                  array of bytes containing Windows metafile, bitmap
                                                 or TIFF file filename.

PLCF: Plex of CPs stored in File

offset (in decimal)  field     type       comment
0                    rgfc      FC[ ]      given that the size of PLCF is cb and the size of the
                                          structure stored in plc is cbStruct, then the number of
                                          structure instances stored in PLCF, iMac, is given by
                                          (cb - 4) / (4 + cbStruct). The number of FCs stored in
                                          the PLCF will be iMac + 1.
4 * (iMac + 1)       rgstruct  struct[ ]  array of some arbitrary structure.

sizeof(PLC) == iMac * (4 + cbStruct) + 4.

PRM: Property Modifier

The PRM has two variants. In the first variant, the PRM records a single one or two byte sprm whose opcode is less than 128.

PRM: Property Modifier (variant 1)

Dec  Hex  field     type  size  bitfield  comment
0    0    fComplex  int   :1    0001      set to 0 for variant 1
          sprm      int   :7    00FE      sprm opcode
          val       int   :8    FF00      sprm's second byte if necessary

In the second variant, prm.fComplex is 1, and the rest of the structure records an index to a grpprl stored in the CLX (described in Complex File Format topic).

PRM: Property Modifier (variant 2)

Dec  Hex  field     type  size  bitfield  comment
0    0    fComplex  int   :1    0001      set to 1 for variant 2
          igrpprl   int   :15   FFFE      index to a grpprl stored in CLX portion of file.

SED: Section Descriptor

Dec  Hex  field     type  size  bitfield  comments
0    0    fSwap     int   :1    0001      runtime flag, indicates whether orientation should be
                                          changed before printing. 0 indicates no change,
                                          1 indicates orientation change.
fUnk int :1 0002 used internally by Word for Windows fn int :14 FFFC used internally by Word for Windows 2 2 fcSepx FC file offset to beginning of SEPX stored for section. If sed.fcSepx == 0xFFFFFFFF, the section properties for the section are equal to the standard SEP (see SEP SEP: Section Properties Dec Hex field type comments 0 0 bkc uns char break code: 0 No break 1 New column 2 New page 3 Even page 4 Odd page 1 1 fTitlePage uns char set to 1 when a title page is to be displayed 2 2 ccolM1 int number of columns in section - 1. 4 4 dxaColumns int distance that will be maintained between columns 6 6 fFacingCol char facing columns flag 6 6 bUnused1 char reserved 7 7 nfcPgn uns char page number format code: 0 Arabic 1 Roman (upper case) 2 Roman (lower case) 3 Letter (upper case) 4 Letter (lower case) 8 8 pgnStart uns user specified starting page number. 10 A fBiDi uns flag for bidi section 10 A wSpare1 uns 12 C fPgnRestart uns char set to 1 when page numbering should be restarted at the beginning of this section 13 D fEndNote uns char when 1, footnotes placed at end of section. When 0, footnotes are placed at bottom of page. 14 E lnc char line numbering code: 0 Per page 1 Restart 2 Continue 15 F grpfIhdt char specification of which headers and footers are included in this section. See explanation in Headers and Footers topic. 16 10 nLnnMod uns if 0, no line numbering, otherwise this is the line number modulus (e.g. if nLnnMod is 5, line numbers appear on line 5, 10, etc.) 18 12 dxaLnn int distance of 20 14 dyaHdrTop uns y position of top header measured from top edge of page. 22 16 dyaHdrBottom uns y position of top header measured from top edge of page. 24 18 fLBetween char when ==1, draw vertical lines between columns 25 19 vjc char vertical justification code 0 top justified 1 centered 2 fully justified vertically 3 bottom justified 26 1A lnnMin int beginning line number for section 28 1C morPage uns char orientation of pages in that section. 
                                    set to 0 when portrait, 1 when landscape
                                    width of page
32   20   yaPage          uns       default value is 15840 twips
                                    height of page
34   22   dxaLeft         uns       default value is 1800 twips
                                    left margin
36   24   dxaRight        uns       default value is 1800 twips
                                    right margin
38   26   dyaTop          int       default value is 1440 twips
                                    top margin
40   28   dyaBottom       int       default value is 1440 twips
                                    bottom margin
42   2A   dzaGutter       uns       default value is 0 twips
                                    gutter width
44   2C   dmBinFirst      uns       bin number supplied from Windows printer driver indicating
                                    which bin the first page of section will be printed.
46   2E   dmBinOther      uns       bin number supplied from Windows printer driver indicating
                                    which bin the pages other than the first page of section
                                    will be printed.
48   30   dxaColumnWidth  uns       used internally by Word.

sizeof(SEP) == 50 == 0x32.

The standard SEP is all zeros except:

bkc             2
dyaPgn          720 twips (equivalent to .5 in)
dxaPgn          720 twips
fEndNote        True
dyaHdrTop       720 twips
dyaHdrBottom    720 twips

SEPX: Section Property Exceptions

Dec  Hex  field           type      size  bitfield  comment
0    0    cb              byte                      count of bytes in remainder of SEPX.
1    1    grpprl          char[ ]                   list of sprms that encodes the differences between the
                                                    properties of a section and Word's default section
                                                    properties.

TAP: Table Properties

Dec  Hex  field           type             size  bitfield  comments
0    0    jc              int                              justification code. specifies how table row should
                                                           be justified within its column.
                                                           0 left justify
                                                           1 center
                                                           2 right justify
2    2    dxaGapHalf      int                              measures half of the white space that will be
                                                           maintained between text in adjacent columns of a
                                                           table row. A dxaGapHalf width of white space will
                                                           be maintained on
                                                           guarantees that the height of the table will be
                                                           exactly absolute value of dyaRowHeight high. When 0,
                                                           table will be given a height large enough to
                                                           represent all of the text in all of the cells of
                                                           the table.
6    6    fCaFull         int              :1    0001      used internally by Word
          fFirstRow       int              :1    0002      used internally by Word
          fLastRow        int              :1    0004      used internally by Word
          fOutline        int              :1    0008      used internally by Word
          fBiDi           int              :1    0010      table orientation
          *               int              :11   FFE0      reserved
          *               int              :12   FFF0      reserved
8    8    itcMac          int                              count of cells defined for this row. itcMac must be
                                                           >= 0 and less than or equal to 32.
10   A    dxaAdjust       int                              used internally by Word
12   C    rgdxaCenter     int[itcMax + 1]                  rgdxaCenter[0] is the left boundary of cell 0
                                                           measured relative to margin.
                                                           rgdxaCenter[tap.itcMac - 1] is left boundary of
                                                           last cell. rgdxaCenter[tap.itcMac] is right
                                                           boundary of last cell.
78   4E   rgtc            TC[itcMax]                       array of table cell descriptors
398  18E  rgshd           SHD[itcMax]                      array of cell shades

sizeof(TAP) == 462 == 0x1CE.

TBD: Tab Descriptor

The TBD is a substructure of the PAP.

Dec  Hex  field           type      size  bitfield  comments
0    0    jc              int       :3    07        justification code
                                                    0 left tab
                                                    1 centered tab
                                                    2 right tab
                                                    3 decimal tab
                                                    4 bar
          tlc             int       :3    38        tab leader code
                                                    0 no leader
                                                    1 dotted leader
                                                    2 hyphenated leader
                                                    3 single line leader
          *               int       :2    C0        reserved

sizeof(TBD) == 1.

TC: Table Cell Descriptors

The TC is a substructure of the TAP.

Dec  Hex  field           type      size  bitfield  comments
                                                    merged cells are consolidated and the text within the
                                                    cells is interpreted as belonging to one text stream
                                                    for purposes of calculating line breaks.
          fMerged         int       :1    0002      set to 1 when cell has been merged with preceding cell.
          fUnused         int       :14   FFFC      reserved
2    2    brcTop          BRC                       specification of the top border of a table cell
4    4    brcLeft         BRC                       specification of left border of table row
6    6    brcBottom       BRC                       specification of bottom border of table row
8    8    brcRight        BRC                       specification of right border of table row.

sizeof(TC) == 10 == 0xA.

Changes to Structures

BRC
The previously defined BRC is obsolete and has been renamed BRC10. A new BRC is defined with new fields and field names.

CHP
The size of the CHP changed from 16 to 32 bits, with some spare bits added.
The fStrike, hpsPos, and fSysVanish fields were moved within the CHP. A new field, fRMarkDel, is located where fStrike previously was. The fsLid and lid fields were added for the language identification code. Possible values for the lid are defined at the lid field definition. The types of several fields were changed. The ftc field was changed from an unsigned integer to a WORD. The hps field was changed from an unsigned char to a WORD. The fnPic field was changed from an unsigned integer to a BYTE. The fObj and fcObj fields were added for managing embedded objects.

DOP
The unused field fWide was removed. The type of the irmBar field was changed from an int to a BYTE. The spare field rgwSpare was redefined as wSpare2 and wSpare3. New fields fPMHMainDoc, grfSuppression, fKeepFileFormat, fDfltTrueType, and fPagSuppressTopSpacing were added. The page dimensions and margin fields, xaPage, yaPage, dxaLeft, dxaRight, dyaTop, dyaBottom, dxaGutter, were moved from the DOP to the SEP. The DOP dxaGutter field was renamed to dzaGutter in the SEP.

DTTM
This newly defined structure defines Word's internal date format.

FIB
The nLocale field name was changed to lid, a language identification code. If the value of this field is less than 999, it represents nLocale, otherwise it represents a lid. (Defined lid values are enumerated in the CHP structure definition.) The types of the wIdent, nFib, nProduct and lid (formerly nLocale) fields were changed from int to uns. The fEncrypted field was added for managing file encryption. The fcPrEnv and cbPrEnv fields were renamed fcPrDrv and cbPrDrv, respectively. The fcPrEnvPort, cbPrEnvPort, fcPrEnvLand and cbPrEnvLand fields were added to store information about the print environment and page orientation. The fcAutosaveSource and cbAutosaveSource fields were added. The pnChpFirst and pnPapFirst fields were added.

FLD
The ch field was reduced from eight to seven bits, and a new bitfield, fDirty, was added.
New field types (flt) were defined for Link, Symbol, Embedded Object, Merge, User Name, User Initial, User Address.

OBJHEADER
This new structure defines the Embedded Object Properties.

PAP
The fields nfcSeqNumb and nnSeqNumb were added to store auto numbering information. The fields dyaFromText, wr, dyaHeight, and fMinHeight were added to store information about frames (Absolutely Positioned Objects). When converting 1.x documents with Absolutely Positioned Objects, set the old dxaFromText (Distance from text) to both dxaFromText and dyaFromText. The shd field was added to store information about paragraph shading. The size of the PAP structure has changed from 210 == 0xD0 to 216 == 0xD8.

PGD
The type of the cl field changed from uns to int, and the type of the pgn field changed from int to uns.

PIC
The brcl field is obsolete. In Word for Windows 1.x, this field stored the type of border to place around a picture.

SED
The fSpare spare field was changed to fSwap, a runtime flag for landscape/portrait orientation.

SEP
The page dimensions and margin fields, xaPage, yaPage, dxaLeft, dxaRight, dyaTop, dyaBottom, dxaGutter, were moved from the DOP to the SEP. The DOP dxaGutter field was renamed to dzaGutter in the SEP. The morPage field and the reserved bUnused2 field were added. The fAutoPgn field was changed to bUnused1. The dmBinFirst and dmBinOther fields were added to store information about the printer environment. The size of the SEP structure changed from 30 == 0x1E to 50 == 0x32.

TAP
The spare fields, wSpare1, wSpare2, wSpare3, wSpare4, and wSpare5, were removed. The rgshd array field was added at the end of the structure.

TC
The types of the rgbrc, brcTop, brcLeft, brcBottom, and brcRight fields were changed from int to BRC.

Other changes

Autosave Source
This information is written immediately after the sttbfAssoc table and appears only in autosave files.

Embedded Objects
Embedded objects are a new item in the file format.
The native data for an embedded object (OBJ) is stored similarly to pictures (PIC). Note the addition of the OBJHEADER structure.

Hand Annotation
When chp.fSpec == 1, the ASCII code 6 is a special character marking a Hand Annotation (from Pen Windows).

New Sprm definitions
The previous Border sprms, sprmPBrcTop, sprmPBrcLeft, sprmPBrcBottom, sprmPBrcRight, sprmPBrcBetween, sprmPBrcBar and sprmPBrcFromText, are renamed with "10" appended to each name. These sprms now refer to the BRC10 structure. New sprm values are defined for the original names, and also for sprmPicBrcTop, sprmPicBrcLeft, sprmPicBrcBottom, and sprmPicBrcRight, that refer to the redefined BRC structure.
New sprms for auto-numbering paragraphs are sprmPNfcSeqNumb and sprmPNoSeqNumb. Other new paragraph property sprms are sprmPWHeightAbs, sprmPShd, sprmPDyaFromText, and sprmPDxaFromText.
New character property sprms are sprmCFStrikeRM, sprmCFRMark, sprmCFFldVanish and sprmCLid.
New section property sprms are sprmSDmBinFirst, sprmSDmBinOther, sprmSFAutoPgn, sprmSDyaPgn, sprmSDxaPgn, and sprmSBOrientation.
The previous Table sprms sprmTDefTable and sprmTSetBrc are renamed with "10" appended to each name. New sprm values are defined for the original names. New sprms for table cell shading are sprmTDefTableShd and sprmTSetShd.

sttbfAssoc
Indices to the associated string table and descriptions of strings are included.

sttbfFn
The names for all fonts are explicitly included in the font name table. It is still true that ftc = 0 represents the "best" Roman PS font on the system, ftc = 1 represents the Symbol font, and ftc = 2 represents the "best" Swiss (Sans Serif) PS font available.
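
As an illustration of two of the structure definitions given earlier, the PLCF size formula and the two PRM variants can be sketched in Python. This is a minimal sketch, not Word's own code: the function names plcf_count, parse_plcf and decode_prm are hypothetical, and the sketch assumes that values are stored little-endian and that each FC occupies 4 bytes (the latter is implied by the factor 4 in the PLCF formula; the byte order is an assumption).

```python
import struct


def plcf_count(cb: int, cb_struct: int) -> int:
    """Return iMac, the number of structure instances in a PLCF of cb bytes.

    Implements iMac = (cb - 4) / (4 + cbStruct) from the PLCF definition.
    """
    i_mac, remainder = divmod(cb - 4, 4 + cb_struct)
    if cb < 4 or remainder != 0:
        raise ValueError("cb is not consistent with cbStruct")
    return i_mac


def parse_plcf(data: bytes, cb_struct: int):
    """Split a PLCF into its FC array (rgfc) and its raw structures (rgstruct).

    Assumption: FCs are 4-byte little-endian unsigned integers.
    """
    i_mac = plcf_count(len(data), cb_struct)
    # rgfc holds iMac + 1 file positions and starts at offset 0.
    rgfc = list(struct.unpack_from("<%dI" % (i_mac + 1), data, 0))
    # rgstruct starts at offset 4 * (iMac + 1).
    base = 4 * (i_mac + 1)
    rgstruct = [
        data[base + i * cb_struct: base + (i + 1) * cb_struct]
        for i in range(i_mac)
    ]
    return rgfc, rgstruct


def decode_prm(prm: int) -> dict:
    """Decode a 16-bit PRM using the bitfield masks from the PRM tables."""
    if prm & 0x0001:
        # Variant 2: bits 1..15 index a grpprl stored in the CLX.
        return {"fComplex": 1, "igrpprl": (prm & 0xFFFE) >> 1}
    # Variant 1: a 7-bit sprm opcode plus an optional second byte.
    return {
        "fComplex": 0,
        "sprm": (prm & 0x00FE) >> 1,
        "val": (prm & 0xFF00) >> 8,
    }
```

For example, a 16-byte PLCF with cbStruct == 2 holds (16 - 4)/(4 + 2) == 2 structures and 3 FCs, which matches sizeof(PLCF) == iMac * (4 + cbStruct) + 4.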
Autosave Source, 12, 14, 49
BRC, 37
CHP, 39
chp.fBiDi, 39
chp.fBoldBi, 39
chp.fDiacUSico, 39
chp.fItalicBi, 39
chp.fsFtcBi, 40
chp.fsHpsBi, 40
chp.fsIcoBi, 40
chp.fsLidBi, 40
chp.ftcBi, 41
chp.hpsBi, 41
chp.icoBi, 40
chp.lidBi, 41
CHP/CHPX, 38
DOP, 43, 44
dop.fRTLAlignment, 44
dop.ftcDefaultBi, 44
DTTM, 44
Embedded Object, 9, 10, 39, 41, 51
FIB, 44, 45
fib.nFib, 45
fib.nFibBack, 45
fib.wIdent, 45
FLD, 50
fNumRun, 40
Hand Annotation, 15, 43, 57
OBJHEADER, 51
PAP, 51, 54
pap.fBiDi, 54
PIC, 56
SED, 58
SEP, 58, 59
sep.fBiDi, 59
sep.fFacingCol, 59
sep.fRTLGutter, 59
sizeof(CHP), 41
sprmCFBiDi, 24
sprmCFBoldBi, 23
sprmCFDiacColor, 24
sprmCFFldVanish, 23
sprmCFItalicBi, 23
sprmCFRMark, 23
sprmCFStrikeRM, 23
sprmCFtcBi, 23
sprmCHpsBi, 23
sprmCIcoBi, 23
sprmCLid, 23
sprmClidBi, 23
sprmMax, 25
sprmPBrc, 22
sprmPDxaFromText, 22
sprmPFBiDi, 23
sprmPicBrc, 24
sprmPNfcSeqNumb, 22
sprmPNoSeqNumb, 22
sprmPShd, 22
sprmPWHeightAbs, 22
sprmSBOrientation, 24
sprmSDmBinFirst, 24
sprmSDmBinOther, 24
sprmSDxaLeft, 25
sprmSDxaPgn, 24
sprmSDxaRight, 25
sprmSDyaBottom, 25
sprmSDyaPgn, 24
sprmSDyaTop, 25
sprmSDzaGutter, 25
sprmSFAutoPgn, 24
sprmSFBiDi, 24
sprmSFFacingCol, 25
sprmSFRTLGutter, 24
sprmSXaPage, 25
sprmSYaPage, 25
sprmTDefTable, 25, 29
sprmTDefTableShd, 25, 26, 29
sprmTFBiDi, 25
sprmTSetBrc, 25, 31
sprmTSetShd, 25, 31
sttbfAssoc, 12, 14, 36
sttbfFn, 11, 13
TAP, 60, 61
tap.fBiDi, 61
TC, 61

5/29/93   Additions for Bidi version 2.0c (by Alex Morcos)
10/25/91  Reformatted document, removed revision marks and completed the summary of changes from Word for Windows 1.x to 2.0 format.
5/10/91   Updated structures and sprm table for Word for Windows 2.0 format.
1/23/90   Corrected offsets with the definition of the FIB
6/16/89   Updated structure definitions
1/9/89    Document Created